Sunday, August 7, 2016

Taking Apple PkPasses In-House–Working Notes

This year I had a explicit, yet vague, project assigned to me: Move our Apple PkPass from a third party provider to our own internal system. The working environment was the Microsoft Stack with C# and a little googling found that the first 90% of the work could be done by nuget, namely:

  • Install-Package dotnet-passbook
  • Install-Package PushSharp

Created a certificate file on the apple developer site and we are done … easy project… not quite

 

Unfortunately both in-house expertise and 3rd part expertise involved in the original project had moved on. Welcome to reverse engineering black boxes.

 

The Joy of Certificates!

Going to http://www.apple.com/certificateauthority/  open a can of worms. The existing instructions assumed you have a Mac not Windows 10.

The existing instructions found on the web(https://tomasmcguinness.com/2012/06/28/generating-an-apple-ios-certificate-using-windows/)  broke due to some change with Windows or Apple in April 2016 ( apple forum, stack overflow). The solution was Unix on windows via https://cygwin.com/install.html and going the unix route to generate pfx files.

 

The second issue was connected with how we run our IIS servers and the default instructions for installing certificate for dotnet-passbook were not mutually compatible. The instructions said that the certs needed to be install in the Intermediate Certification Authorities – after a few panic hours deploying to load hosts with problems, we discovered that we had to Import to Personal to get dotnet-passbook to work.

The next issue we encountered was that of invisible characters coming along when we copy the thumbprint to our C# code. We implemented a thumbprint check that verified both the length (40) and also walk the characters insuring that all were in range. After this, we verified that we could find the matching certificate. All of this was done on website load. . an error was thrown, the site would not load.

 

This saved us triage time on every new deployment:with an

  • We identify if a thumbprint is ‘corrupt’
  • We verified that the expected certificate is there

The last issue impacts big shops: The certificate should be 100% owned by Dev Ops and never installed on a dev or test machine. This means that alternative certs are needed in those environment. Each cert with have a different thumbprint – hence lots of web.config transformation substituting in the correct thumbprint for the environment. The real life production cert should be owned by dev ops (or security)  with a very strong password that they and they alone know.

 

The Joys of Authentication Tokens

Security review for in-house required that the authentication tokens be a one way hash (SHA384 or higher) and be unique per PkPasses. The existing design used Guids for serial numbers and thus we used a Guid for the authentication token when the pass was first created.  We can never recreate an existing PkPass because we do not know the authentication token, just the hash.  When a request comes in for the latest path, we hash the authentication token sent in the authentication header and compare it to the hash. We then persist it in memory and insert it into the PkPass Json,  then we Zip and Sign the new PkPass.  Security is happy.

 

Now when it comes to the 3rd party provider, we were fortunate that they stored the authentication tokens in plain text, so it was just a hash and save the hash into our database. If they had hashed (as they should have), then we would need to replicate their hash method. If it was a SHA1 and SHA-2 was required by our security, then we would need to do some fancy footwork to migrate the hash, i.e.

  1. add a “SHA” column iWn our table,
  2. when a new request comes in examine the SHA value
  3. if it is “1” then use the authentication token presented and authenticated to create a SHA-2 hash and update the SHA column to “2”
  4. if it is “2” then authenticate appropriately.

This will allow us to track the uplift rate to SHA-2. At some point security would likely say “delete the SHA1 PkPass records”. This is easy because we have tracked them.

 

Push Notifications

This went easy except for missing that a Push Certificate is NOT used for PKPass files. Yes, it is not used.  It is used for registered 3rd party developed Apple applications. The certificate used for connecting to the Apple Push Notification Service (APNS) is the certificate used to sign the PkPass files. There is no separate push notification certificate. Also, using PushSharp, you must set “validate certificate” to false, or an exception will be thrown.

 

The pushTokens are device identifiers and APNS does not provide feedback if the device still exists (one of my old phones exists, but is at the bottom of an outdoor privy in a national park…), is turned off, or is out of communication.  The author of PushSharp, Redth, has done an excellent description of the problem here. The logical way to keep the history in check is to track when each pass is last retrieved and then periodically delete the push notifications for devices where none of the associated passes have been retrieved in the last year.  You will have “dead” push tokens in some circumstances.

 

I have a pkPass, my iPhone got destroyed. I installed the pkPass on the new phone. The old iPhone push token will never be eliminated while I maintain my PkPass. Why? because we do not know which iPhone is getting updates!

 

Minor hiccup

The get serial number since API call had a gotcha dealing with modified since query parameters. Apple documentation suggest that a date be used and we originally code it up assuming that this was a http if-modified-since header. QAing on a iPhone clarified that it was a query parameter and not a http header. We simply moved the same date there and encountered two issues:

  • We had a time-offset issue, our code was working off the database local time and our code deeming it to be universal time…. (which a http header would be)
  • Our IIS security settings did not like seeing a “:” in a query parameter. We resolved by used “yyyyMMddHHmmss” format

The real gotcha that was stated in the apple documentation was that this is an arbitrary token  that is daisy chained from one call to the next. It did not need to be a date. A date is a logical choice, but it is not required to be a date.

 

The value received in the last get serial numbers response is what is sent in the next get serial numbers request. Daisy chaining. The iPhone does nothing but echo it back.

Avoiding a Migraine

The dotnet-passbook code puts into the Json, the pass type identifier name in the certificate regardless of what you passed in. This is good and wise and secure. It has an unfortunate side effect, the routing

devices/{deviceLibraryIdentifier}/registrations/{passTypeIdentifier} and passes/{passTypeIdentifier}/{serialNumber}

is determined by this pass type identifier. If you are running a site and passes come from passes/foobar/1234, but your certificate name is “JackShyte” then the Json in the pass returned would read JackShyte. When the iPhone gets a push token, it would then construct the url for the update as passes/JackShyte/1234 … which will likely return a 404. The PkPass will never be updated unless you create additional routings!!

 

The solution that I took was to compare the {passTypeIdentifier} in the routing to the certificate. If they did not match, then 404 immediately and log an exception. While it is technically possible to “unwind” such a foul up, the path is not pretty.

 

Migration

The key for migration is a stepped approach

  1. Deploy your new solution and test it, correct any issues that you find in the production environment
  2. Deploy the application or mechanism for creating new PkPasses (this could be part of 1), so all new passes use the in-house system
  3. Update your data from the third party provider with authentication tokens (or their hash) and serial numbers. You want to do this after 2, because you want this list to be closed (no new passes created on the third party system)
  4. Have the 3rd party provider change the WebServiceUrl to the in-house solution. In theory, a Moved response to the in house system would also work (I have not tested this with an iPhone).
  5. Since the 3rd party wants to shut down in time, then you must send out a push notification to every push token you have.  You will likely want to throttle this if you have a large numbers of push tokens (in my case, 30 million) because every push token could result in a request for a new PkPass file.
    1. This may need to be repeated to insure adequate coverage for devices off line or abroad without data plans

Bottom Line

The original design worked, but there was a ton of details that had to be sorted out. I have omitted the nightmares that QA had trying to validate stuff, especially the migration portions.

2 comments: