James Stanley


Protohackers problem 2 retrospective

Sun 11 September 2022
Tagged: protohackers

I released Protohackers problem 2 on Thursday evening. The problem asks you to implement a server that stores timestamped price data and lets clients query the mean price over custom time ranges. (This post contains potential spoilers for the problem: if you haven't solved it yet and would like to, you should not read this until after you've solved it!)

So far, 499 people have signed up for the site (+437 since last time). 199 have solved the test problem (+170) and 76 have solved the latest problem (+62).

The reason behind the big increase in users is that there was a thread about Protohackers on the HN front page for a little while.

The problem statement

So far there has only been one point of confusion highlighted in the problem statement (a big improvement over last time!).

The problem statement said:

Because each client's data represents a different asset, each client can only query the data supplied by itself.

To my mind it was obvious that each connection was a separate client, but some people interpreted it to mean that only unique client IP addresses are separate clients, and that every connection from the same IP address counts as the same client.

I've updated the problem text to say:

Each connection from a client is a separate session. Each session's data represents a different asset, so each session can only query the data supplied by itself.

Ideally the problem statement would never get changed for any reason, so that every user always sees the same text. However, where the problem statement is confusing, I consider that a bug, and bugs should be fixed.

Someone pointed out in the Discord chat that it is very weird for clients to be "analysing historical price data", but permanently lose access to it as soon as they disconnect. It is weird, I agree. The intention for this problem was that you need to persist state across multiple requests (compared to the previous problem where you didn't have to persist any state), but you do not have to share state between multiple clients.
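To make the "state per session, nothing shared" idea concrete, here is a minimal sketch in Python. It is not the reference solution: the class name SessionPrices, the inclusive range bounds, the integer (floor) mean, and returning 0 for an empty range are all assumptions made for illustration, and the wire protocol is left out entirely.

```python
class SessionPrices:
    """Price history for one connection (hypothetical helper).

    One instance is created per connection and thrown away when the
    connection closes, so sessions never see each other's data.
    """

    def __init__(self):
        self._prices = {}  # timestamp -> price

    def insert(self, timestamp, price):
        self._prices[timestamp] = price

    def mean(self, start, end):
        # Assumption: both ends of the range are inclusive.
        selected = [p for t, p in self._prices.items() if start <= t <= end]
        if not selected:
            return 0  # Assumption: an empty range yields 0.
        # Assumption: integer mean, rounded down.
        return sum(selected) // len(selected)


# Two connections, two independent stores.
session_a = SessionPrices()
session_b = SessionPrices()
session_a.insert(1, 100)
session_a.insert(2, 102)
session_a.insert(10, 500)
print(session_a.mean(1, 2))  # only session A's data is visible here
print(session_b.mean(1, 2))  # session B has inserted nothing
```

A server would create one such object when it accepts a connection and keep it only for as long as that connection is open.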

Email deliverability

A handful of people have not been receiving the signup/login emails. Two American universities are bouncing the mails with a message about poor domain reputation. I have emailed postmaster@ their domains to ask to get unblocked.

Two (would-be) users have email on their own domains hosted by Fastmail, which seems to be silently blackholing the mails with no bounce message. I've opened a support ticket with Fastmail.

So I don't know whether I will have to add some alternative non-email-based login flow, or whether the problem will get solved on its own over time as (a) I complain to people who are blocking the emails, and (b) the domain reputation improves as it ages.

Upcoming features

Overall leaderboard

I still haven't done the overall leaderboard. The current plan is that if N people have ever successfully solved any problem, then your rank for a problem you haven't solved yet will be N+1, and the overall leaderboard will be ranked by your summed rank, ascending.

(So if there are 3 problems, and you got 1st place on all 3, your score is 3. If you got 2nd place on all 3, your score is 6, which is higher than 3, which is a worse overall ranking).

I don't like the fact that, for people who have not solved all problems, leaderboard positions can change even if they don't do anything (just because new people solve some problems, which increases N). But I think it is better than any alternative I can think of. In particular, it is stable for those users that have solved all problems, and it means you always get an improvement in your overall score by solving a new problem.
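The scoring rule can be sketched in a few lines of Python. This is only an illustration of the plan described above, not the site's code: the function name overall_leaderboard, the input shape (one dict of user to rank per problem), the user names, and breaking ties by name are all assumptions.

```python
def overall_leaderboard(problem_ranks):
    """Rank users by summed per-problem rank, ascending.

    problem_ranks: list with one dict per problem, mapping user -> rank.
    A user missing from a problem's dict has not solved it and gets N+1,
    where N is the number of users who have solved at least one problem.
    """
    users = set()
    for ranks in problem_ranks:
        users.update(ranks)
    unsolved_rank = len(users) + 1

    scores = {
        user: sum(ranks.get(user, unsolved_rank) for ranks in problem_ranks)
        for user in users
    }
    # Lower score is better. Assumption: ties are broken by name.
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))


ranks = [
    {"alice": 1, "bob": 2, "carol": 3},  # problem 0
    {"alice": 1, "bob": 2},              # problem 1
    {"alice": 1, "bob": 2},              # problem 2
]
print(overall_leaderboard(ranks))
# [('alice', 3), ('bob', 6), ('carol', 11)]
```

Here N is 3, so carol's two unsolved problems each count as rank 4. If a fourth user solved something, carol's score would rise to 13 without her doing anything, which is the instability described above.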

I am keen not to create any overall leaderboard until the logic is definitely settled, because I don't want to give someone 1st place on the leaderboard and then change my mind and hand it to someone else!

IPv6

Several people have asked about IPv6, including one person who has a VPS that apparently only has IPv6 connectivity. So I plan to look at adding IPv6 support. I am wary that this will double the surface area of possible connectivity issues, but I think it is worth doing.

The most important part is that the checker can access people's servers using IPv6, but it would also be nice if the web site was available over IPv6.
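For people writing solutions, being reachable over IPv6 mostly means listening on an IPv6 socket. A minimal sketch in Python, assuming you want one listening socket that accepts both IPv4 and IPv6 where the platform supports it and falls back to IPv4 otherwise (the function name listen_dual_stack is made up for this example):

```python
import socket


def listen_dual_stack(port):
    """Return a listening TCP socket, dual-stack if the host allows it."""
    if socket.has_dualstack_ipv6():
        try:
            # "::" is the IPv6 wildcard; with dualstack_ipv6=True the same
            # socket also accepts IPv4 connections.
            return socket.create_server(
                ("::", port), family=socket.AF_INET6, dualstack_ipv6=True
            )
        except OSError:
            pass  # IPv6 is compiled in but not usable on this host.
    # Fallback: IPv4 only.
    return socket.create_server(("0.0.0.0", port))


server = listen_dual_stack(0)  # port 0 lets the OS pick a free port
print(server.getsockname())
server.close()
```

Whether a given VPS actually routes IPv6 traffic to that socket is a separate question, which is part of the extra surface area for connectivity issues.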

Dashboard

I have an InkyWHAT screen hanging from a shelf next to my desk with a 3D-printed bracket.

It shows the total number of users, problem attempts, and problem solutions, as well as information about the most recent accepted solution, the current time, and the server's CPU load.

Mainly it just saves me from having to manually look this stuff up in the database. It gets its data from a special (authenticated) API endpoint.

If you have an InkyWHAT and you don't like how it totally fades to black every time you update the screen, you should use the InkyFast wrapper from pwnagotchi. It uses an alternative lookup table for the display driver so that it updates more quickly. It can leave a tiny amount of ghosting, but it's a big improvement over the whole display fading to black every 10 seconds.
