Sunday, August 28, 2016

Apple Store Passbook UML Diagrams and Error Messages

While working on a recent project, a major stumbling block was the lack of clear documentation of what happens where. This was confirmed when I searched for some of the messages that the iPhone posts to the Log REST endpoint: there were zero hits!

 

image

 

For a Store Card, let us look at the apparent sequence diagram:

 

[Image: Store Card sequence diagram]

 

Log Error Messages Seen and Likely Meaning

  • Pass inactive or deleted, or someone changed the authentication token
    • [2016-08-28 11:57:01 -0400] Unregister task (for device ceed8761e584e814ed4fe73cbb334ee9, pass type pass.com.reddwarfdogs.card.dev, serial number 85607BFE98D91A-765F7B05-D5E4-4B32-B16D-69C2038EF522; with web service url https://llc.reddwarfdogs.com/passbook) encountered error: Authentication failure
    • [2016-08-28 20:44:25 +0700] Register task (for device 19121d6b570b31a3fa56dbd45411c933, pass type pass.com.reddwarfdogs.card.dev, serial number 85607BFE98D91A-765F7B05-D5E4-4B32-B16D-69C2038EF522; with web service url https://llc.reddwarfdogs.com/passbook) encountered error: Authentication failure
    • [2016-08-24 10:04:38 +0800] Web service error for pass.com.reddwarfdogs.card.dev (https://llc.reddwarfdogs.com/passbook): Update requested for unregistered serial number 8C6772F099D51AA3-7A32F5FB-F7F8-4285-A2A2-79FC66DF942C
  • Bad Record Keeping in your application
    • [2016-08-23 19:58:35 -0700] Web service error for pass.com.reddwarfdogs.card.dev (https://llc.reddwarfdogs.com/passbook): Server ignored the 'if-modified-since' header (Tue, 23 Aug 2016 16:54:10 GMT) and returned the full unchanged pass data for serial number '8C6771F89ED51DAA-AAF3100E-C365-4CCD-8C95-ADC974F52894'.
    • [2016-08-23 16:49:38 -0700] Get pass task (pass type pass.com.reddwarfdogs.card.dev, serial number 8C6771F89ED31FAE-57ED753A-8464-408E-95EF-CEF75DBB30D6, if-modified-since Tue, 09 Aug 2016 21:57:32 GMT; with web service url https://llc.reddwarfdogs.com/passbook) encountered error: Received invalid pass data (The pass cannot be read because it isn’t valid.)
      • Cause: corruption of the pass, or a change of the certificate used to sign it
    • [2016-08-23 13:56:44 -0700] Web service error for pass.com.reddwarfdogs.card.dev (https://llc.reddwarfdogs.com/passbook): Server requested update to serial number '8C6771F89ED41BAC-FFBF3B69-98F1-4F2A-A8B7-5AF457558EE7', but the pass was unchanged.
    • [2016-08-23 11:58:25 -0700] Web service error for pass.com.reddwarfdogs.card.dev (https://llc.reddwarfdogs.com/passbook): Device received spurious push. Request for passesUpdatedSince '20160823180851' returned no serial numbers. (Device = 2c04d18e5f8480f97bb9318b4065dba0)
    • [2016-08-08 10:23:57 -0700] Web service error for pass.com.reddwarfdogs.card.dev (https://llc.reddwarfdogs.com/v1/passbook): Response to 'What changed?' request included 1 serial numbers but the lastUpdated tag (20160808172351) remained the same.
      • Cause: a duplicate push notification sent to a device, or a logic error. If the tag is 1234, the server should select passes with a tag > 1234 and NOT >= 1234
  • Apple gives little guidance on status codes and how the iPhone will react to them
    • [2016-08-23 15:46:33 +0700] Get serial #s task (for device 6f175696d73dec465c561f4d3ee2dfe7, pass type pass.com.reddwarfdogs.card.dev, last updated (null); with web service url https://llc.reddwarfdogs.com/passbook) encountered error: Unexpected response code 504
    • [2016-08-23 01:42:53 -0700] Get serial #s task (for device 2c04d18e5f8480f97bb9318b4065dba0, pass type pass.com.reddwarfdogs.card.dev, last updated 20160823083910; with web service url https://llc.reddwarfdogs.com/passbook) encountered error: Unexpected response code 408
    • [2016-08-08 18:53:00 +0800] Get serial #s task (for device 726996d0f44f44b19f157aa0824f64cf, pass type pass.com.reddwarfdogs.card.dev, last updated (null); with web service url https://llc.reddwarfdogs.com/passbook) encountered error: Unexpected response code 596
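
The “spurious push” and “lastUpdated tag remained the same” messages both come down to how the server answers the “what changed?” request. Below is a minimal sketch of that selection, not our production code: PassRecord, SerialNumbersResponse and UpdatedPasses are hypothetical names, and the tag is assumed to be a number such as 20160808172351. The point is the strict greater-than comparison and returning nothing (so the caller can answer 204) when no pass is newer than the tag.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical record: a pass and the tag of its last change.
public sealed class PassRecord
{
    public string SerialNumber { get; set; }
    public long UpdatedTag { get; set; }   // e.g. 20160808172351
}

// Shape of the JSON body returned to the device (lastUpdated + serialNumbers).
public sealed class SerialNumbersResponse
{
    public string LastUpdated { get; set; }
    public List<string> SerialNumbers { get; set; }
}

public static class UpdatedPasses
{
    // Returns null when nothing is newer than the tag, so the caller can answer 204 No Content.
    public static SerialNumbersResponse Since(IEnumerable<PassRecord> passes, long? passesUpdatedSince)
    {
        var changed = passes
            // Strictly greater than: ">=" would return the same pass again with an unchanged tag.
            .Where(p => passesUpdatedSince == null || p.UpdatedTag > passesUpdatedSince.Value)
            .ToList();

        if (changed.Count == 0)
        {
            return null;
        }

        return new SerialNumbersResponse
        {
            // The new tag is the newest change we are reporting; the device echoes it back next time.
            LastUpdated = changed.Max(p => p.UpdatedTag).ToString(CultureInfo.InvariantCulture),
            SerialNumbers = changed.Select(p => p.SerialNumber).ToList()
        };
    }
}
```

With a tag of 1234 and passes tagged 1234 and 1300, only the second pass is returned and the new tag is 1300.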

I suspect there are more messages – I have just not stumbled across them yet.

Friday, August 26, 2016

Solving PushSharp.Apple Disconnect Issue

While load testing a new Apple Passbook application, I suddenly saw some 200K transmission errors from my WebApi application. Searching the web, I found reports that a “high” rate of connect/disconnect to the Apple Push Notification Service (APNS) causes APNS to force a disconnect.

 

While Apple does have a (very, very high) limit on the number of notifications before it will refuse connections for an hour, the limit for connect/disconnect is much lower. After playing around a bit, I found that if I persisted the connection in a static field, I no longer had this issue.

 

Below is a sample of the code.

  • Note: we disconnect and reconnect whenever an error happens (I have not seen an error yet) 

 

using Newtonsoft.Json.Linq;

using PushSharp.Apple;

using System;

using System.Collections.Generic;

using System.Security.Cryptography.X509Certificates;

using System.Text;

namespace RedDwarfDogs.Passbook.Engine.Notification

{

    public class AppleNotification : INotification

    {

        private readonly IPassbookSettings _passbookSettings;

        private readonly ILogger _logger;

        private static ApnsServiceBroker _apnsServiceBroker;

        private static object lockObject = new object();

        public AppleNotification(ILogger logger,IPassbookSettings passbookSettings)

        {

            _logger= Guard.EnsureArgumentIsNotNull(logger, "logger");

            _passbookSettings = Guard.EnsureArgumentIsNotNull(passbookSettings, "passbookSettings");

        }

        public void SendNotification(HashSet<string> deviceTokens)

        {

            if (deviceTokens == null || deviceTokens.Count == 0)

            {

                return;

            }

            try

            {

                _logger.Write("PassbookEngine_SendNotification_Apple");

                // Create a new broker if needed

                if (_apnsServiceBroker == null)

                {

                    X509Certificate2 cert = _passbookSettings.ApplePushCertificate;

                    if (cert == null)

                        throw new InvalidOperationException("pushThumbprint certificate is not installed or has invalid Thumbprint");

                    var config = new ApnsConfiguration(ApnsConfiguration.ApnsServerEnvironment.Production,

                        _passbookSettings.ApplePushCertificate, false);

                    _logger.Write("PassbookEngine_SendNotification_Apple_Connect");

                    _apnsServiceBroker = new ApnsServiceBroker(config);

                    // Wire up events

                    _apnsServiceBroker.OnNotificationFailed += (notification, aggregateEx) =>

                    {

                        aggregateEx.Handle(ex =>

                        {

                            _logger.Write("Apple Notification Failed", "Direct", ex);

                            _logger.Write("PassbookEngine_SendNotification_Apple_Error");

                            // See what kind of exception it was to further diagnose

                            if (ex is ApnsNotificationException)

                            {

                                var notificationException = (ApnsNotificationException)ex;

                                var apnsNotification = notificationException.Notification;

                                var statusCode = notificationException.ErrorStatusCode;

                            }

                            _logger.Write("SendNotification", "PushToken Rejected", ex);

                            // We reset to null to recreate / connect

                            Restart();

                            return true;

                        });

                    };

                    _apnsServiceBroker.OnNotificationSucceeded += (notification) =>

                    {

                    };

                    // Start the broker

                }

                var sentTokens = new StringBuilder();

                lock (lockObject)

                {

                    _apnsServiceBroker.Start();

                    foreach (var deviceToken in deviceTokens)

                    {

                        if (string.IsNullOrWhiteSpace(deviceToken) || deviceToken.Length < 32 || deviceToken.Length > 256 || deviceToken.Contains("-"))

                        {

                            // Invalid token: skip it to stay in Apple's good books.

                            // We use GUIDs (which contain '-') as fake push tokens; never send those to Apple.

                            // We do not want to get blacklisted.

                        }

                        else

                        {

                            // Queue a notification to send

                            var notification = new ApnsNotification

                            {

                                DeviceToken = deviceToken,

                                Payload = JObject.Parse("{\"aps\":{\"badge\":7}}")

                            };

                            try

                            {

                                _apnsServiceBroker.QueueNotification(notification);

                                sentTokens.AppendFormat("{0} ", deviceToken);

                            }

                            catch (System.InvalidOperationException)

                            {

                                // Assuming already in queue

                            }

                        }

                    }

                    try

                    {

                        //duplicate signals may occur

                        _apnsServiceBroker.Stop();

                    }

                    catch { }

                }

                var auditLog = new Log

                {

                    Message = sentTokens.ToString(),

                    RequestHttpMethod = "Post"

                };

                _logger.Write("Passbook", PassbookLogMessageCategory.SendNotification.ToString(),

                    "PassbookAudit", "Passbook", auditLog);

                return;

            }

            catch (Exception exc)

            {

                // We swallow notification exceptions (for example, APNS is offline) so the rest of the processing can continue.

                _logger.Write("SendNotification", "One or more notifications via Apple (APNS) failed", exc);

                Restart();

                _apnsServiceBroker = null; //force a reset

            }

        }

        private void Restart()

        {

            if (_apnsServiceBroker != null)

            {

                try

                {

                    //duplicate signals may occur

                    _apnsServiceBroker.Stop();

                }

                catch { }

                _logger.Write("PassbookEngine_SendNotification_Apple_Restart");

                _apnsServiceBroker = null;

            }

        }

    }

}
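
One thing to watch in the listing above: the null check on the static broker happens outside the lock, so two concurrent requests could each create a broker. Below is a minimal sketch of one way to guard a shared, resettable instance. ResettableShared is a hypothetical helper of my own naming, not part of PushSharp, and it is written generically so it can be tried without an APNS connection.

```csharp
using System;

// Holds one shared instance, creates it on first use under a lock,
// and lets callers discard it after an error so the next call reconnects.
public sealed class ResettableShared<T> where T : class
{
    private readonly object _gate = new object();
    private readonly Func<T> _factory;
    private readonly Action<T> _shutdown;
    private T _instance;

    public ResettableShared(Func<T> factory, Action<T> shutdown)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        if (shutdown == null) throw new ArgumentNullException("shutdown");
        _factory = factory;
        _shutdown = shutdown;
    }

    // Returns the shared instance, creating it only if there is none.
    public T Get()
    {
        lock (_gate)
        {
            if (_instance == null)
            {
                _instance = _factory();
            }
            return _instance;
        }
    }

    // Drops the current instance (if any) and shuts it down once.
    public void Reset()
    {
        T old;
        lock (_gate)
        {
            old = _instance;
            _instance = null;
        }

        if (old == null)
        {
            return;
        }

        try
        {
            _shutdown(old);
        }
        catch
        {
            // Duplicate stop signals may occur; ignore them.
        }
    }
}
```

In the class above, this would presumably be a static field of type ResettableShared<ApnsServiceBroker>, with the factory building and wiring up the broker and the shutdown action calling Stop().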

Sunday, August 7, 2016

Taking Apple PkPasses In-House–Working Notes

This year I had an explicit, yet vague, project assigned to me: move our Apple PkPass from a third-party provider to our own internal system. The working environment was the Microsoft stack with C#, and a little googling found that the first 90% of the work could be done with NuGet packages, namely:

  • Install-Package dotnet-passbook
  • Install-Package PushSharp

Create a certificate file on the Apple developer site and we are done … easy project … not quite.

 

Unfortunately, both the in-house expertise and the third-party expertise involved in the original project had moved on. Welcome to reverse engineering black boxes.

 

The Joy of Certificates!

Going to http://www.apple.com/certificateauthority/ opened a can of worms. The existing instructions assumed you have a Mac, not Windows 10.

The existing instructions found on the web (https://tomasmcguinness.com/2012/06/28/generating-an-apple-ios-certificate-using-windows/) broke due to some change in Windows or Apple in April 2016 (Apple forum, Stack Overflow). The solution was Unix on Windows via https://cygwin.com/install.html, going the Unix route to generate the pfx files.

 

The second issue was that the way we run our IIS servers and the default instructions for installing the certificate for dotnet-passbook were not mutually compatible. The instructions said that the certs needed to be installed in Intermediate Certification Authorities. After a few panicked hours of deploying to load hosts with problems, we discovered that we had to import to Personal to get dotnet-passbook to work.

The next issue was invisible characters coming along when we copied the thumbprint into our C# code. We implemented a thumbprint check that verified the length (40) and walked the characters, ensuring that all were in range. After this, we verified that we could find the matching certificate. All of this was done on website load: if an error was thrown, the site would not load.

 

This saved us triage time on every new deployment:

  • We identify whether a thumbprint is ‘corrupt’
  • We verify that the expected certificate is there
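
A minimal sketch of such a start-up check follows. It is an illustration rather than our actual code: ThumbprintCheck is a hypothetical class name, and the store location (local machine, Personal) is an assumption based on where we ended up importing the certificate.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography.X509Certificates;

public static class ThumbprintCheck
{
    // A SHA-1 thumbprint is 40 hexadecimal characters. Anything else,
    // including invisible characters picked up by copy and paste, is rejected.
    public static bool IsWellFormed(string thumbprint)
    {
        return thumbprint != null
            && thumbprint.Length == 40
            && thumbprint.All(IsHexDigit);
    }

    private static bool IsHexDigit(char c)
    {
        return (c >= '0' && c <= '9')
            || (c >= 'a' && c <= 'f')
            || (c >= 'A' && c <= 'F');
    }

    // Call at site start-up and let the exception stop the site from loading.
    // Assumption: the certificate lives in the Personal ("My") store of the local machine.
    public static X509Certificate2 FindOrThrow(string thumbprint)
    {
        if (!IsWellFormed(thumbprint))
        {
            throw new InvalidOperationException("Thumbprint is corrupt: expected exactly 40 hex characters.");
        }

        var store = new X509Store(StoreName.My, StoreLocation.LocalMachine);
        try
        {
            store.Open(OpenFlags.ReadOnly);
            var found = store.Certificates.Find(X509FindType.FindByThumbprint, thumbprint, false);
            if (found.Count == 0)
            {
                throw new InvalidOperationException("No certificate with the configured thumbprint is installed.");
            }
            return found[0];
        }
        finally
        {
            store.Close();
        }
    }
}
```

Checking the characters before the store lookup matters because a thumbprint with a hidden character simply finds nothing, which looks the same as a missing certificate.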

The last issue impacts big shops: the production certificate should be 100% owned by DevOps and never installed on a dev or test machine. This means that alternative certs are needed in those environments. Each cert will have a different thumbprint, hence lots of web.config transformations substituting in the correct thumbprint for each environment. The real-life production cert should be owned by DevOps (or security), with a very strong password that they and they alone know.

 

The Joys of Authentication Tokens

Security review for the in-house system required that the authentication tokens be stored as a one-way hash (SHA384 or higher) and be unique per PkPass. The existing design used Guids for serial numbers, and thus we used a Guid for the authentication token when the pass was first created. We can never recreate an existing PkPass on our own because we do not know the authentication token, just the hash. When a request comes in for the latest pass, we hash the authentication token sent in the Authorization header and compare it to the stored hash. If they match, we hold the token in memory, insert it into the PkPass Json, then Zip and Sign the new PkPass. Security is happy.
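
The hash-and-compare step can be sketched as below. This is an illustration under assumptions: AuthTokenHasher is a hypothetical name, the hash is assumed to be stored as upper-case hex of the UTF-8 token, and the header is assumed to arrive in the form "ApplePass <token>".

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class AuthTokenHasher
{
    // One-way SHA-384 hash of the token, as upper-case hex (96 characters).
    public static string Hash(string authenticationToken)
    {
        using (var sha = SHA384.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(authenticationToken));
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }

    // authorizationHeader is the raw header value, e.g. "ApplePass 0f8fad5b-...".
    public static bool IsAuthorized(string authorizationHeader, string storedHash)
    {
        const string scheme = "ApplePass ";
        if (authorizationHeader == null || storedHash == null)
        {
            return false;
        }
        if (!authorizationHeader.StartsWith(scheme, StringComparison.Ordinal))
        {
            return false;
        }

        string presented = Hash(authorizationHeader.Substring(scheme.Length).Trim());
        return SameText(presented, storedHash);
    }

    // Compares every character instead of stopping at the first difference.
    private static bool SameText(string a, string b)
    {
        if (a.Length != b.Length)
        {
            return false;
        }
        int diff = 0;
        for (int i = 0; i < a.Length; i++)
        {
            diff |= a[i] ^ b[i];
        }
        return diff == 0;
    }
}
```

Only a request that presents the right token lets the server rebuild the pass, because the token itself is never stored.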

 

As for the third-party provider, we were fortunate that they stored the authentication tokens in plain text, so it was just a matter of hashing each token and saving the hash into our database. If they had hashed the tokens (as they should have), then we would have needed to replicate their hash method. If it was SHA1 and SHA-2 was required by our security, then we would have needed to do some fancy footwork to migrate the hash, i.e.

  1. add a “SHA” column in our table,
  2. when a new request comes in examine the SHA value
  3. if it is “1” then use the authentication token presented and authenticated to create a SHA-2 hash and update the SHA column to “2”
  4. if it is “2” then authenticate appropriately.

This will allow us to track the uplift rate to SHA-2. At some point security would likely say “delete the SHA1 PkPass records”. This is easy because we have tracked them.
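
The four steps above can be sketched as follows. We did not have to build this, so treat it as a thought experiment: PassAuthRecord and HashUplift are hypothetical names, SHA-384 stands in for “SHA-2”, and the provider's hashes are assumed to be upper-case hex of the UTF-8 token.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical row: the "SHA" column plus the stored hash.
public sealed class PassAuthRecord
{
    public int ShaVersion { get; set; }    // 1 = SHA1 (legacy), 2 = SHA-2
    public string TokenHash { get; set; }  // upper-case hex
}

public static class HashUplift
{
    // Authenticates the presented token and upgrades a legacy SHA1 record on success.
    public static bool Authenticate(PassAuthRecord record, string presentedToken)
    {
        if (record.ShaVersion == 1)
        {
            if (!Matches(SHA1.Create(), presentedToken, record.TokenHash))
            {
                return false;
            }
            // The plain token is only available now, so this is the moment to re-hash it.
            record.TokenHash = Hex(SHA384.Create(), presentedToken);
            record.ShaVersion = 2;
            return true;
        }

        return record.ShaVersion == 2
            && Matches(SHA384.Create(), presentedToken, record.TokenHash);
    }

    // Hashes the UTF-8 token and disposes the algorithm afterwards.
    public static string Hex(HashAlgorithm algorithm, string token)
    {
        using (algorithm)
        {
            byte[] digest = algorithm.ComputeHash(Encoding.UTF8.GetBytes(token));
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }

    private static bool Matches(HashAlgorithm algorithm, string token, string expectedHash)
    {
        return string.Equals(Hex(algorithm, token), expectedHash, StringComparison.Ordinal);
    }
}
```

The caller would persist the record after a successful call; counting rows by ShaVersion then gives the uplift rate.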

 

Push Notifications

This went easily, except that I initially missed that a separate Push Certificate is NOT used for PkPass files. That kind of certificate is for registered third-party Apple applications. The certificate used for connecting to the Apple Push Notification Service (APNS) is the same certificate used to sign the PkPass files; there is no separate push notification certificate. Also, when using PushSharp, you must set “validate certificate” to false, or an exception will be thrown.

 

The pushTokens are device identifiers, and APNS does not provide feedback on whether the device still exists (one of my old phones exists, but is at the bottom of an outdoor privy in a national park…), is turned off, or is out of communication. The author of PushSharp, Redth, has written an excellent description of the problem here. The logical way to keep the history in check is to track when each pass was last retrieved and then periodically delete the push tokens for devices where none of the associated passes have been retrieved in the last year. You will still have “dead” push tokens in some circumstances.

 

For example: I have a PkPass and my iPhone gets destroyed. I install the PkPass on the new phone. The old iPhone's push token will never be eliminated while I maintain my PkPass. Why? Because we do not know which iPhone is retrieving the updates!
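
The clean-up rule described above can be sketched like this. Registration and PushTokenPruner are hypothetical names, and treating a pass with no recorded retrieval as stale is my assumption, not a rule from Apple.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row linking a device's push token to one pass.
public sealed class Registration
{
    public string PushToken { get; set; }
    public string SerialNumber { get; set; }
}

public static class PushTokenPruner
{
    // Returns the push tokens for which NONE of the associated passes
    // has been retrieved during the last year.
    public static List<string> StaleTokens(
        IEnumerable<Registration> registrations,
        IDictionary<string, DateTime> lastRetrievedUtcBySerial,
        DateTime nowUtc)
    {
        DateTime cutoff = nowUtc.AddYears(-1);

        return registrations
            .GroupBy(r => r.PushToken)
            .Where(g => g.All(r => IsStale(r.SerialNumber, lastRetrievedUtcBySerial, cutoff)))
            .Select(g => g.Key)
            .ToList();
    }

    private static bool IsStale(string serialNumber, IDictionary<string, DateTime> lastRetrieved, DateTime cutoff)
    {
        DateTime last;
        // Assumption: a pass with no recorded retrieval counts as stale.
        return !lastRetrieved.TryGetValue(serialNumber, out last) || last < cutoff;
    }
}
```

Note how the destroyed-phone case survives this rule: the old token shares a serial number with the new phone, the pass keeps being retrieved, and so the old token is never reported as stale.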

 

Minor hiccup

The “get serial numbers since” API call had a gotcha dealing with the modified-since parameter. Apple's documentation suggests that a date be used, and we originally coded it up assuming that this was an HTTP If-Modified-Since header. QAing on an iPhone clarified that it was a query parameter and not an HTTP header. We simply moved the same date there and encountered two issues:

  • We had a time-offset issue: our code was working off the database local time while deeming it to be universal time (which an HTTP header would be)
  • Our IIS security settings did not like seeing a “:” in a query parameter. We resolved this by using the “yyyyMMddHHmmss” format

The real gotcha, which was stated in the Apple documentation, was that this is an arbitrary token that is daisy-chained from one call to the next. A date is a logical choice, but it is not required to be a date.

 

The value returned in the last get serial numbers response is what is sent in the next get serial numbers request: daisy chaining. The iPhone does nothing but echo it back.
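
If you do use a date for the tag, a small helper keeps the format and the time zone handling in one place. A minimal sketch, with UpdateTag as a hypothetical name; refusing non-UTC values is my way of making the local-versus-universal mistake fail loudly.

```csharp
using System;
using System.Globalization;

public static class UpdateTag
{
    // No ':' or other characters that IIS request filtering may reject.
    private const string Format = "yyyyMMddHHmmss";

    // Formats a UTC time as the tag; refuses local or unspecified times.
    public static string FromUtc(DateTime utc)
    {
        if (utc.Kind != DateTimeKind.Utc)
        {
            throw new ArgumentException("Convert database local time to UTC before building the tag.", "utc");
        }
        return utc.ToString(Format, CultureInfo.InvariantCulture);
    }

    // Parses the tag echoed back by the device; the result is always UTC.
    public static bool TryParse(string tag, out DateTime utc)
    {
        return DateTime.TryParseExact(
            tag,
            Format,
            CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
            out utc);
    }
}
```

A value read from the database usually has an unspecified kind, so the caller has to decide (and state) what it is before it can become a tag.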

Avoiding a Migraine

The dotnet-passbook code puts into the Json the pass type identifier named in the certificate, regardless of what you passed in. This is good and wise and secure. It has an unfortunate side effect: the routing

devices/{deviceLibraryIdentifier}/registrations/{passTypeIdentifier} and passes/{passTypeIdentifier}/{serialNumber}

is determined by this pass type identifier. If you are running a site and passes come from passes/foobar/1234, but your certificate name is “JackShyte”, then the Json in the pass returned would read JackShyte. When the iPhone gets a push notification, it would then construct the url for the update as passes/JackShyte/1234 … which will likely return a 404. The PkPass will never be updated unless you create additional routings!!

 

The solution that I took was to compare the {passTypeIdentifier} in the routing to the one in the certificate. If they did not match, then return 404 immediately and log an exception. While it is technically possible to “unwind” such a foul up, the path is not pretty.
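
The check itself is tiny. A sketch, assuming the certificate's pass type identifier has already been read into a string (how that is done is left out here), with PassTypeGuard and the logError callback as hypothetical names:

```csharp
using System;
using System.Net;

public static class PassTypeGuard
{
    // Returns OK when the route matches the certificate, otherwise logs and returns 404.
    public static HttpStatusCode Check(
        string routePassTypeIdentifier,
        string certificatePassTypeIdentifier,
        Action<string> logError)
    {
        if (string.Equals(routePassTypeIdentifier, certificatePassTypeIdentifier, StringComparison.Ordinal))
        {
            return HttpStatusCode.OK;
        }

        logError(string.Format(
            "Route pass type '{0}' does not match certificate pass type '{1}'.",
            routePassTypeIdentifier,
            certificatePassTypeIdentifier));

        return HttpStatusCode.NotFound;
    }
}
```

Failing at this point, before any pass is generated, means a mismatched pass is never issued, so no device ends up holding one that cannot be updated.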

 

Migration

The key for migration is a stepped approach:

  1. Deploy your new solution and test it, correct any issues that you find in the production environment
  2. Deploy the application or mechanism for creating new PkPasses (this could be part of 1), so all new passes use the in-house system
  3. Update your data from the third party provider with authentication tokens (or their hash) and serial numbers. You want to do this after 2, because you want this list to be closed (no new passes created on the third party system)
  4. Have the 3rd party provider change the WebServiceUrl to the in-house solution. In theory, a Moved response to the in house system would also work (I have not tested this with an iPhone).
  5. Since the third party will want to shut down eventually, you must send out a push notification to every push token you have. You will likely want to throttle this if you have a large number of push tokens (in my case, 30 million) because every push notification could result in a request for a new PkPass file.
    1. This may need to be repeated to ensure adequate coverage for devices that are offline or abroad without data plans
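
The throttling in step 5 can be as simple as sending in batches with a pause between them. A minimal sketch: ThrottledPush is a hypothetical name, the send callback stands in for whatever actually queues the notifications, and the batch size and pause are values you would tune against what your pass endpoint can absorb.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ThrottledPush
{
    // Splits the tokens into consecutive batches of at most batchSize.
    public static IEnumerable<List<string>> Batches(IEnumerable<string> pushTokens, int batchSize)
    {
        if (batchSize <= 0)
        {
            throw new ArgumentOutOfRangeException("batchSize");
        }

        var batch = new List<string>(batchSize);
        foreach (var token in pushTokens)
        {
            batch.Add(token);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<string>(batchSize);
            }
        }

        if (batch.Count > 0)
        {
            yield return batch;
        }
    }

    // Sends one batch, waits, then sends the next, so pass requests arrive as a trickle.
    public static void SendAll(
        IEnumerable<string> pushTokens,
        int batchSize,
        TimeSpan pauseBetweenBatches,
        Action<List<string>> send)
    {
        foreach (var batch in Batches(pushTokens, batchSize))
        {
            send(batch);
            Thread.Sleep(pauseBetweenBatches);
        }
    }
}
```

With the notification class from the earlier post, the callback could be something like batch => notifier.SendNotification(new HashSet<string>(batch)), where notifier is an AppleNotification instance.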

Bottom Line

The original design worked, but there were a ton of details that had to be sorted out. I have omitted the nightmares that QA had trying to validate stuff, especially the migration portions.