SDCC 2016 Post Hotel Lottery, A Statistical Analysis

By: ComicConDad

It has been almost 3 weeks since the 2016 San Diego Comic-Con hotel lottery, but many questions still remain. In this article we will look at what has been revealed after the event, examine some data supplied by FoCC members, and see what these items tell us and what they don’t. In contrast to last year, where the problems centered on the selection process itself, this year the issues were with communication. Both Comic-Con International and onPeak have admitted that there were problems with what information was given out, and FoCC received the following comment from David Glanzer, the Director of Marketing and Public Relations of Comic-Con International.

onPeak and Comic-Con work closely to figure out the details for the hotel sale. This was the first time using a new method for sales since onPeak and Travel Planners merged and we’ve gained a lot of knowledge on better communication between our organizations. The feedback from this sale has given us a lot of information and suggestions of how we can improve moving forward and we will, as we do with all things at Comic-Con, debrief and see how we can work to make the process more efficient.

So with that in mind, let’s dig in.

Overview

To recap the process, which was new this year, lottery participants were given a URL that directed them to a staging area. Participants were then assigned an ID that corresponded to a random place in a queue. A very welcome change was the presence of an example form that people could look at and practice on once the staging area opened. This form did contain a watermark that could interfere with the working of the practice page; hopefully this will be fixed before next year. Once the lottery started, participants were granted access to the hotel request form according to their place in the queue. While participants waited for the form, a queue number showed up in the browser window, decrementing periodically as one moved up in line. This update was sporadic, not continuous, and many people who got to the form quickly never saw a queue number at all. The form itself was similar to previous years, with the exception of a missing option where one could request any downtown hotel if they were not placed in one of their top 6 choices. The queue number and a unique identifier for the participant were tied to browser cookies. Since participants did not have to use any sort of registration or ID to enter the lottery (such as a Comic-Con International member ID), they could use different browsers or multiple browser users to create as many participating windows as they wanted, increasing their chances of getting a good place in line.

The lottery itself ran quite smoothly, though there were several reports of lag. This year there was no email summarizing what was entered in the submitted form, or indicating that the form submission had been successfully received. Several days after the lottery, emails started going out to some (very nervous) people informing them of their results; however, many (even more nervous) people received no notification whatsoever.

A big unknown this year was what exactly would determine the order in which requests would be processed. Different criteria were mentioned at different times, including that the actual submission time would play a role. This is an important one to point out since it matches what had been done before and provides an incentive to fill out the form as quickly as possible. To make it even more confusing, once the form was submitted participants saw a message that said that requests would be randomized yet again.

Data and the TimeStamp

FoCC members volunteered data that allowed us to quickly determine how this part of the new process actually worked. This data described 216 distinct hotel lottery submissions, 119 of which contained timestamps and the result of the request. A comparison of times to result showed that the time a participant was granted access to the form was the key factor. We captured four distinct outcomes:

  • T- Top 6 choice
  • D- Downtown hotel that was not a top 6 pick
  • N- Non-downtown hotel that was not a top 6 pick
  • W- Waitlist notification

and this table shows how they varied by form access time. (Requests that were flagged as duplicates are not included in this data, as we were unable to separate them from participants that shared some information with us but did not share a result.)

Comparison of Times to Result

Access Time     T      D      N      W
Under 1 Min    75%    12%     8%     4%
1-2 Min        58%    21%    15%     6%
2-3 Min        16%    15%    54%    15%
3-4 Min         0%     5%    56%    39%
4-5 Min         0%     1%    25%    74%
Over 5 Min      0%     0%    16%    84%
Overall        32%     9%    25%    34%

This matches a statement from Kristina Simkins, Director of Product Development at OnPeak.

The only timestamp used when processing reservations was the time at which the user was granted access to the form. Using this timestamp allowed us to keep users in the order

This quote, and others given below, are taken from Kerry Dixon’s excellent article, “CCI, onPeak Offer Insight to San Diego Comic-Con General Hotel Sale 2016,” on the SDCC Unofficial Blog.

The chances of receiving a top 6 hotel, or any hotel at all, dropped quickly as a function of time. There were anecdotal reports of requests that did not get access to the form until after 3 minutes but still got a top pick, but nobody supplied the details of such a request. The time that a participant took to fill out the form did not have an effect.
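For anyone who wants to build the same kind of table from their own notes, here is a minimal sketch of the tabulation. The records are invented purely for illustration (they are not the FoCC data), and the names `records`, `bucket_for`, and `outcome_table` are hypothetical; the only things taken from the article are the four outcome codes and the one-minute buckets.

```python
from collections import Counter, defaultdict

# Hypothetical sample records: (seconds until form access, outcome code).
# These values are illustrative only, NOT the data volunteered by FoCC members.
records = [
    (35, "T"), (50, "T"), (58, "D"),
    (75, "T"), (110, "N"),
    (150, "N"), (170, "W"),
    (400, "W"),
]

BUCKETS = ["Under 1 Min", "1-2 Min", "2-3 Min", "3-4 Min", "4-5 Min", "Over 5 Min"]


def bucket_for(seconds):
    """Map an access time in seconds to one of the one-minute buckets."""
    index = min(int(seconds // 60), len(BUCKETS) - 1)
    return BUCKETS[index]


def outcome_table(rows):
    """Return {bucket: {outcome: percent}}, with row percentages rounded to whole numbers."""
    counts = defaultdict(Counter)
    for seconds, outcome in rows:
        counts[bucket_for(seconds)][outcome] += 1

    table = {}
    for bucket in BUCKETS:
        total = sum(counts[bucket].values())
        if total == 0:
            continue  # no submissions landed in this bucket
        table[bucket] = {
            code: round(100 * counts[bucket][code] / total) for code in "TDNW"
        }
    return table


for bucket, row in outcome_table(records).items():
    print(bucket, row)
```

Because each row is rounded independently, a row can add up to 99% or 101%, which is why a line in a table like this does not always total exactly 100%.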

Speed and Queue numbers

We also saw that participants were let into the system quickly, at what looks like a constant rate. (This matches the functionality described in onQueue’s “High Availability” white paper, available on request from their website.) 21% of sessions received access to the form in each of the first 2 minutes of the lottery, after which the rate tapers off, possibly due to extra sessions being abandoned or to the rate being lowered due to server load.
Here is the percentage of sessions given form access, by minute:

  • Under 1 minute- 21%
  • 1 to 2 minutes- 21%
  • 2 to 3 minutes- 14%
  • 3 to 4 minutes- 9%
  • 4 to 5 minutes- 8%
  • over 5 minutes- 28%

Combining this with the previous set of numbers gives some interesting statistics. For example, 45% of sessions took longer than 3 minutes to get to the form, and those sessions had little to no chance at one of their top 6 choices and only a small chance at a downtown hotel.
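The arithmetic behind that combination can be sketched as follows. The percentages are copied from the two sets of numbers above; the function `combined_rate` and the short bucket labels are just names chosen for this sketch. It also assumes that the per-minute session shares and the outcome table describe the same population, which is only approximately true since not every submission came with a result.

```python
# Share of sessions granted form access in each minute (percent, from the list above).
session_share = {"<1": 21, "1-2": 21, "2-3": 14, "3-4": 9, "4-5": 8, ">5": 28}

# Outcome percentages per access-time bucket, in the order (T, D, N, W),
# from the "Comparison of Times to Result" table.
outcomes = {
    "<1": (75, 12, 8, 4),
    "1-2": (58, 21, 15, 6),
    "2-3": (16, 15, 54, 15),
    "3-4": (0, 5, 56, 39),
    "4-5": (0, 1, 25, 74),
    ">5": (0, 0, 16, 84),
}


def combined_rate(buckets, outcome_index):
    """Return (total session share, session-weighted average of one outcome column)."""
    weight = sum(session_share[b] for b in buckets)
    weighted = sum(session_share[b] * outcomes[b][outcome_index] for b in buckets)
    return weight, weighted / weight


late = ["3-4", "4-5", ">5"]  # sessions that waited longer than 3 minutes
share, top6 = combined_rate(late, 0)
_, downtown = combined_rate(late, 1)
_, waitlist = combined_rate(late, 3)
print(f"{share}% of sessions: top 6 {top6:.1f}%, "
      f"other downtown {downtown:.1f}%, waitlist {waitlist:.1f}%")
```

Under these assumptions the late sessions have a 0% chance at a top 6 pick, roughly a 1% chance at another downtown hotel, and about a 73% chance of landing on the waitlist.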

Many people were able to report the first queue number that they saw, and lower numbers did correspond to faster access times. Unfortunately, the lack of precision in the queue numbers precludes analysis. The highest queue number reported was around 18700, much lower than the number of participant sessions, suggesting that multiple queues were used. This was another source of confusion, as people naturally interpreted the queue number as a count of how many total sessions were in front of them. A person might have seen a queue number of 5000 and been understandably confused if they were then waitlisted, not knowing that their queue was not the only one or that the actual queue number they started with might have been much higher but had dropped to 5000 by the time they first saw it.

Another change this year was a more aggressive approach in how submissions were flagged as duplicates, but without advance notification of what criteria would be used. When multiple submissions were determined to be duplicates, the last one was accepted and earlier ones discarded with no possibility of challenging the result. This resulted in situations where people had requests rejected that were essentially identical to successful ones submitted in past years.

Let’s compare this summary to two additional statements from Kristina Simkins:

Approximately 2.5 requests submitted per room available

Not only did more Comic-Con users get one of their top six choices than ever before (38% last year vs. 49% this year), but more people got their first hotel choice than last year (18% last year vs. 25% this year)

These were presented to give context about the size of the lottery and to support the claim that it was more successful than previous approaches. Unfortunately, while these statements sound good at first, when you examine them closely they don’t provide much information at all.

These kinds of comments are quite common in business and politics and are the sort of thing that can engender distrust in statistics (lies, damned lies, etc.) if you aren’t extremely careful. For example, a section of road might be described as the most dangerous, in terms of accidents per year, but also as the safest, taking into account the number of cars that travel on it each year. Both views can be useful: the first might be important if you are evaluating whether the area has sufficient emergency services nearby, the latter if you are deciding where to budget money for improved safety features. The important thing is to be precise in what words are used, provide a complete picture, and use the statistics for descriptive rather than rhetorical purposes.
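To make the road example concrete, here is a small sketch with two made-up roads. Every figure is invented for illustration, and the names `roads` and `accidents_per_million_cars` are hypothetical.

```python
# Hypothetical figures for two stretches of road; invented purely for illustration.
roads = {
    "Highway A": {"accidents_per_year": 120, "cars_per_year": 12_000_000},
    "Back road B": {"accidents_per_year": 15, "cars_per_year": 300_000},
}


def accidents_per_million_cars(road):
    """Accident rate normalized by traffic volume."""
    return road["accidents_per_year"] / road["cars_per_year"] * 1_000_000


# "Most dangerous" by raw count of accidents per year.
most_accidents = max(roads, key=lambda name: roads[name]["accidents_per_year"])

# "Most dangerous" by accidents per car that travels the road.
highest_rate = max(roads, key=lambda name: accidents_per_million_cars(roads[name]))

print("Most accidents per year:", most_accidents)
print("Highest accident rate:  ", highest_rate)
```

With these numbers the highway has eight times as many accidents per year, yet the back road has five times the accident rate per car. Which road is "the most dangerous" depends entirely on which definition you are quoting.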

Ms. Simkins’s statements mention requests, rooms, Comic-Con users, and people. (The use of the term “Comic-Con users” is likely a misnomer, since there was no requirement that participants have a CCI account.) But in the actual lottery, there is a person in front of a computer who can have multiple browser sessions and submit multiple requests that can be for multiple rooms with multiple people hoping to stay in them. Also, many of the requests were discarded as duplicates, so the 2.5 number could refer to total requests or successful requests, and the word “requests” could mean actual requests or requested rooms.
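A quick sketch shows how much the "requests per room" figure can move depending on which definition is used. None of these numbers come from onPeak or CCI; they are invented so that one reading happens to give 2.5, and the variable names are hypothetical.

```python
# All numbers below are invented for illustration; none come from onPeak or CCI.
rooms_available = 10_000
submitted_requests = 25_000    # every form submission, including duplicates
discarded_duplicates = 5_000   # submissions thrown out as duplicates
avg_rooms_per_request = 1.4    # a single request can ask for several rooms

accepted_requests = submitted_requests - discarded_duplicates

ratios = {
    "submitted requests per room": submitted_requests / rooms_available,
    "accepted requests per room": accepted_requests / rooms_available,
    "requested rooms per room": accepted_requests * avg_rooms_per_request / rooms_available,
}

for definition, value in ratios.items():
    print(f"{definition}: {value:.1f}")
```

The same hypothetical lottery yields 2.5, 2.0, or 2.8 depending on the definition, so a bare "2.5 requests per room" does not tell us how competitive the sale really was.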

Similarly, the statistic on top choices and first choice mentions people, but is that actually requests or does it account for the actual number of people in each request? Does it include those that were waitlisted? Were the number and distribution of rooms comparable to last year? Another problem is the comparison with 2015, a year where the hotel sale had massive and manifold problems. It would be more useful to know precisely how these numbers are defined and how they have varied in recent years when the sale operated as expected. The introduction of randomization shouldn’t have had much effect on this statistic since there are just a few hotels that tend to be the top choices. As it stands it is impossible to know how much of the difference that is being claimed as an improvement is simply due to the lack of last year’s errors, was just random, or was due to other effects.

The lack of the option to be placed in any downtown hotel likely caused part of the increase, since there would have been cases where an early request would not have gotten a top pick and instead have been randomly placed into a non-downtown hotel, leaving downtown rooms available for later requests that had different hotels as a top pick.

Duplicates & Fairness

A more problematic possible factor is the change in how requests were flagged as duplicates. Since people who book in larger groups will be more likely to make multiple bookings, the new logic could well have skewed the numbers. Again it is impossible to tell without a more complete picture.

So was the lottery fair? The evidence given so far is problematic but we can look at one aspect of fairness. Given what we know, were there certain groups that were at a disadvantage?

The fact that the time to complete the form no longer matters certainly helped people who weren’t as experienced with the process, made a mistake or two (that’s me) or were booking for a larger group and needed time to enter the names. However, the lack of clear information on what determined one’s place in the processing means that people who had later times might not have made the best decisions on what hotels they entered in their list. Knowing what we know now, someone who didn’t get access until after 3 minutes might have chosen some hotels outside of downtown with preferable locations, or in a certain price range.

The new duplication logic disproportionately affected people booking in larger groups. Ironically, many of these will have been people who have attended Comic-Con previously and used a strategy based on what was allowed in previous years. Also, when requests were flagged as duplicates the lack of a notification meant people lost valuable time in securing another option for lodging.

People who used multiple browser sessions also had an advantage over those who did not, or who used fewer. In previous years, this would not have been a factor since the time it took to fill out the form was paramount. If this aspect of the lottery is to remain the same, then onPeak and CCI need to think through the consequences. A process exerts a pressure on how people use the system, and next year I would expect many more sessions to be in play.
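The size of that advantage can be estimated with a small simulation. This is a sketch under simplifying assumptions that the actual system may not satisfy: each session is given an independent, uniformly random place in a single line, and a person simply uses whichever of their sessions lands furthest forward. The function names and the seed are arbitrary choices for this example.

```python
import random


def best_position(num_sessions, rng):
    """Best (lowest) place in line, as a fraction of the line, among one person's sessions."""
    return min(rng.random() for _ in range(num_sessions))


def average_best_position(num_sessions, trials=20_000, seed=2016):
    """Average best place in line over many simulated lotteries."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(best_position(num_sessions, rng) for _ in range(trials))
    return total / trials


for sessions in (1, 2, 5, 10):
    print(sessions, "session(s):", round(average_best_position(sessions), 3))
```

Under these assumptions the expected best position with k sessions is 1/(k+1) of the way through the line, so a single session lands in the middle on average while five sessions put you in roughly the first sixth. Given how steeply the outcomes fall off after the first two minutes, that is a substantial edge.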

Going Forward

So how can things improve? I do believe that onPeak and CCI want to make things fairer, but if that is the goal then changes in how things are run are only part of the solution. The lottery can’t be fair unless everyone has enough information to make smart and considered choices. Clear and consistent communication is just as important as the process itself. Let’s hope that in the future we are told what is important, what is allowed and what isn’t, what the numbers and statements we see during the lottery actually mean, and what we will be informed of given the result of our request.

Recording and sharing your experience will also help. Thanks to the people who took time to share their data this year, we know more than we would have otherwise. Next year I hope we can better organize so that we can collect even more data. Every bit helps, and it is important not to have participation based on outcome. People tend to share great news and bad news, but it’s the regular, run-of-the-mill outcomes that help complete the picture. No matter how much communication improves, there will always be some gaps and some ambiguity, and your data can help.

Twitter: @comiccondad
Blog: https://comiccondad.wordpress.com/