7/L Segment Display

Today I was troubleshooting a sensor that was giving an error code — but Error 52 “Over Voltage” was not what I expected… regardless I checked the voltage and it looked fine. About ten minutes later I realized that the display was mounted upside down and it was Error 25, “Programming Error” — duh.

 

Untitled2.jpg

Which one? Choose wisely

 

While this is totally my fault, it is also avoidable. So if you are a company that makes things with important codes shown on a seven-segment display AND your display can plausibly be mounted upside down, then don’t use any codes that read as a different valid code in the other orientation.

 

614BPv1jPaL._SY355_-314x252-1.jpg

 

Here is how many numbers satisfy that criterion:

Figure_1.png

To find this I required that a number either not be a number when upside down (because it contains a 7, 4 or 3) or be itself when upside down (96 -> 96).


 

Here is the code:

import matplotlib.pyplot as plt

def flip(num):
    flips = [
        ["0","0"],
        ["1","1"],
        ["2","2"],
        ["3","E_"],
        ["4","H_"],
        ["5","5"],
        ["6","9"],
        ["7","L_"],
        ["8","8"],
        ["9","6"]]

    new_num = ""
    for i in range(len(num)):
        new_num += flips[int(num[len(num)-1-i])][1]
    if new_num == num or len(new_num) > len(num): return num
    else: return

def find_all_digits(digits):
    valid_options = []
    for i in range(0,10**digits):
        val = str(i)
        val = "0"*(digits-len(val))+val
        if flip(val) is not None:
            valid_options.append(flip(val))
    return len(valid_options)

x,y = [],[]
for i in range(1,9):
    x.append(i)
    y.append(find_all_digits(i)/10**i)

fig = plt.figure()
ax = fig.add_subplot(111)
plt.scatter(x,y)
for i,j in zip(x,y):
    ax.annotate("("+str(i)[:4]+","+str(int(round(j*10**i)))+")",xy=(i-0.075*len(str(round(j*10**i))),j+0.01))
plt.title("Number of valid numbers that aren't confusing when upside down")
plt.xlabel("N\n Number of Digits Used")
plt.ylabel('Valid Entries / Total Possible Entries')
plt.show()
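As a quick sanity check on the rule, here is a minimal sketch, separate from the script above, that returns what a code reads as after a 180° rotation. The names `FLIPPED`, `upside_down` and `is_safe` are hypothetical helpers made up for illustration; 3, 4 and 7 are treated as unreadable, matching the criteria described earlier.

```python
# Hypothetical helpers for illustration; not part of the original script.
# Digits that still read as a digit after a 180-degree rotation.
FLIPPED = {"0": "0", "1": "1", "2": "2", "5": "5", "6": "9", "8": "8", "9": "6"}

def upside_down(code):
    """Return the code as read upside down, or None if it is not a number."""
    rotated = []
    for digit in reversed(code):      # rotating also reverses the digit order
        if digit not in FLIPPED:      # 3, 4 and 7 do not flip into digits
            return None
        rotated.append(FLIPPED[digit])
    return "".join(rotated)

def is_safe(code):
    """A code is safe if it is unreadable upside down or reads as itself."""
    flipped = upside_down(code)
    return flipped is None or flipped == code

print(upside_down("52"))
print(is_safe("52"), is_safe("96"), is_safe("17"))
```

Under these assumptions, Error 52 flips into Error 25, which is exactly the mix-up from the story.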

 


Hearthstone Meets Math

There is a pretty rad game called Hearthstone; it’s a digital card game made by Blizzard. Similar to Magic: The Gathering, two players face off using (usually) preconstructed decks of creatures and spells to defeat the opponent.

There is a mechanic in this game called discover, in which you are presented with three choices and get to select one.

Choose_Your_Path(55598).png

For example, a card like Choose Your Path can be a “toolbox” card — if you draw it in the late game it can be a large creature or if you are getting swarmed by your opponent it can be an area-of-effect style spell. Getting this flexibility has costs, however, because it puts the selection into your hand (effectively taxing you one mana) and doesn’t guarantee you the spell you need. This flexibility makes it a powerful and skill-testing card.

Capture.PNG

Example selection choices from a “Discover” card

The new set revealed a new card, A New Challenger…, and discussions cropped up about how good this card truly is.

A_New_Challenger...(90173).png

Since most discover cards have the “toolbox” aspect to them, we can’t easily rank the cards offered. However, with this card you just want the biggest minion every time, so we should be able to determine pretty accurately how good this card is.

To do this we need to use math. So to restate the problem:

Given a deck of size N with cards numbered 1-N, what is the expected value of the largest card in a three card hand?

 

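We get that the expected value of a hand of h cards from a pool of p cards is

EV(\texttt{p,h})=\dfrac{1}{C_{(\texttt{p})(\texttt{h})}}\sum_{n=\texttt{h}}^{\texttt{p}}( C_{(n)(\texttt{h})}-C_{(n-1)(\texttt{h})})*n

where C_{(n)(\texttt{h})} is the number of ways to choose h cards from n.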

 

This represents taking the difference between the number of h-card hands that can be drawn from the first n cards and from the first n-1 cards. Notably, every hand counted in that difference must contain card n, otherwise it could already be formed from the first n-1 cards! These hands have n as their largest card, so the value of each such hand must be n.
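The counting argument can be checked against plain enumeration. This is a minimal sketch, assuming Python 3.8+ for `math.comb`; the function names `ev_brute_force` and `ev_counting` are made up for illustration and are not the names used in the source code further down.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def ev_brute_force(p, h):
    """Average of the largest card over every h-card hand from cards 1..p."""
    hands = list(combinations(range(1, p + 1), h))
    return Fraction(sum(max(hand) for hand in hands), len(hands))

def ev_counting(p, h):
    """Same value from the counting argument: hands whose largest card is n."""
    # comb(n, h) - comb(n - 1, h) hands have n as their largest card.
    total = sum((comb(n, h) - comb(n - 1, h)) * n for n in range(h, p + 1))
    return Fraction(total, comb(p, h))

print(ev_brute_force(10, 3), ev_counting(10, 3))
```

Both functions return exact fractions, so they can be compared directly rather than within a floating-point tolerance.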

The probability of getting a score n from a pool of p cards with a hand of size h is P(p,h,n), given by:

P(\texttt{p,h,}n)=\dfrac{\texttt{h}}{n}\left(1-\sum_{k=n+1}^{\texttt{p}}P(\texttt{p,h,}k)\right)

In essence, this says that once every higher score has been ruled out, the hand is drawn from the n lowest cards, and the chance that card n is among the h cards in hand is h/n. (The source code below uses a zero-indexed list, so the same factor is written there as h/(n+1).)

figure_6.png

Note that the figure shows the asymptote as .755; it should be .75. This is the same output as the EV equation, but divided by N so that decks of different sizes can be compared.

 

Okay so what does this tell us? The card will be, on average, 75% of the best card available.

Therefore, A New Challenger can be evaluated as getting you the 75th percentile of the ordered 6-Cost minions in the Hearthstone standard format with Taunt and Divine Shield.

 

Extensions:

 

What if Discover showed you more than 3 cards?

Logically, you would get a better card. Here is if you could see four cards:

figure_5.png

And here is a similar figure but showing for a large range of hand sizes:

Figure_7.png

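This gives you the equation, for a hand of h cards in the limit of a large pool:

\lim_{\texttt{p}\to\infty}\dfrac{EV(\texttt{p,h})}{\texttt{p}}=1-\dfrac{1}{\texttt{h}+1}

For h = 3 this is the 0.75 found earlier.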

Here is the source code


import itertools, math
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np

def nCr(n,r):
    f = math.factorial
    return f(n) / f(r) / f(n-r)

def EV(h,d):
    #Deck of size d, hand of size h
    delta = 0
    points = 0
    total_combs = 0
    for i in range(h,d+1):
        delta = nCr(i,h) - total_combs
        total_combs += delta
        points += delta*i
    return points / total_combs / d

def EV_recursive(d,h):
    #Deck of size d, hand of size h
    ev = 0
    p_card = [0 for i in range(d)]

    for n in range(d-1,0,-1): p_card[n] = (h/(n+1))*(1-sum(p_card[n+1:]))
    for c in range(d): ev += p_card[c]*(c+1)

    return ev / d

end_val_1 = EV_recursive(10000,3)
end_val_2 = EV_recursive(10000,4)

rolls = 3
max_sided_die = 40

y_1 = [EV(rolls,i) for i in range(rolls,max_sided_die)]
y_2 = [EV(4,i) for i in range(4,max_sided_die)]
x_1 = [i for i in range(rolls,max_sided_die)]
x_2 = [i for i in range(4,max_sided_die)]

fig = plt.figure()
ax = fig.add_subplot(111)

#Plotting 3 card hand
p = np.polyfit([0,1], [end_val_1,end_val_1], 1)
plt.plot(x_1,p[0]*np.log(x_1)+p[1],'--',color='blue')
ax.annotate("y = " + str(end_val_1)[:6],xy=(3,end_val_1+0.01))
plt.scatter(x_1,y_1, color="blue")
blue_patch = mpatches.Patch(color='blue', label='3 cards to select from')
orange_patch = mpatches.Patch(color='orange', label='4 cards to select from')

#Plotting 4 card hand
p = np.polyfit([0,1], [end_val_2,end_val_2], 1)
plt.plot(x_2,p[0]*np.log(x_2)+p[1],'--',color='orange')
ax.annotate("y = " + str(end_val_2)[:6],xy=(3,end_val_2+0.01))
plt.scatter(x_2,y_2, color = 'orange')

plt.legend(handles=[blue_patch,orange_patch])
plt.title('EV for selecting 3 or 4 cards from a deck of N cards\nnumbered 1-N and getting points equal to the largest one')
plt.xlabel("N\nnumber of cards")
plt.ylabel('EV / N\npercent of maximum score expected')
fig.show()

 

How much better is a 96mph fastball than a 95mph fastball?

While hiking a few months ago Jeff and I were talking about baseball and specifically, the fastball. To me, it feels like once you get into the upper 90’s the speeds seem to be idolized in a way that may not make sense. Does the actual velocity matter? Is it that much harder to bat against a ball going a mile per hour faster? In essence:

How much better is a 96mph fastball than a 95mph fastball?

 

Capture.JPG

I couldn’t find this anywhere so I took a stab at doing it myself. I looked at three stats:

  • Contact% = Pitches on which contact was made / Swings
  • Swing% = Swings / Pitches
  • SwStr% = Swings and misses / Total pitches

These stats were chosen because they made sense to me and could be calculated from pitches individually, as opposed to needing the data from the entire at bat.

 

For the data, I used Clayton Kershaw’s pitches over his entire MLB career (2008-present). Why Clayton Kershaw? Because I don’t know baseball very well and someone told me he throws both fast and a lot. This gave me ~16,000 data points. Here are the results from that data:

 


Contact%:

contact_percent.png

Contact%, defined by [Pitches on which contact was made / Swings], goes down, as most people predicted. For every mph you increase the velocity, the contact percentage drops by an average of 2%, which seems significant. In essence, it’s harder to make contact with faster pitches.

 


Swing%:

swing_percentage.png

Swing%, defined by [Swings / Pitches] goes up, as everyone I asked predicted. For every mph you increase the velocity, batters swing about 3% more often. Looking at the range it shows you that the fastest of his pitches are swung at about 40% more than his slowest pitches, which is a huge gap. Batters swing more at faster pitches.

 


SwStr%:

swstr_percentage.png

SwStr%, defined by [Swings and misses / Total pitches], also goes up, as everyone I asked predicted. This is the most striking to me: batters swing and miss on average twice as often when the fastball is 95mph compared to when it’s 92mph.

 


Conclusion:

Faster fastballs are better. The data we looked at clearly showed that all three of these stats get better for the pitcher as you increase the velocity. And the titular question can be answered now:

Compared to a 95mph fastball, batters make contact with a 96mph fastball about 2% less often, swing at it about 3% more often, and swing and miss about 1.5% more often.

 


Notes:

The code can be found here. I used Python 3.x with pandas and matplotlib to parse the data.

Data points are taken by putting the slowest 200 pitches into a bucket. That bucket is then checked for the stat in question and the velocities of its pitches are averaged. The bucket size of 200 was chosen because it was the smallest number that showed the results without being unnecessarily noisy.
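The bucketing can be sketched in plain Python. This is not the linked code: the function name `bucket_stats`, the `(velocity, swung, contact)` tuple layout, and the choices to continue with the next-slowest 200 pitches and to drop a short final bucket are all assumptions made for illustration.

```python
def bucket_stats(pitches, bucket_size=200):
    """Group pitches by velocity and compute the three stats per bucket.

    pitches: list of (velocity_mph, swung, contact) tuples (assumed layout).
    """
    ordered = sorted(pitches, key=lambda pitch: pitch[0])
    rows = []
    # Walk the sorted pitches in full buckets; a short final bucket is dropped.
    for start in range(0, len(ordered) - bucket_size + 1, bucket_size):
        bucket = ordered[start:start + bucket_size]
        swings = sum(1 for _, swung, _ in bucket if swung)
        contacts = sum(1 for _, swung, contact in bucket if swung and contact)
        rows.append({
            "velocity": sum(v for v, _, _ in bucket) / bucket_size,
            "contact_pct": contacts / swings if swings else None,
            "swing_pct": swings / bucket_size,
            "swstr_pct": (swings - contacts) / bucket_size,
        })
    return rows

# Tiny made-up sample, with a bucket size of 2 instead of 200.
sample = [(96, True, True), (90, True, True), (95, True, False), (91, False, False)]
for row in bucket_stats(sample, bucket_size=2):
    print(row)
```

Each returned row is one point on the plots: average velocity on the x axis and the stat in question on the y axis.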

Obvious extensions would be doing this for other pitchers (does it extend beyond 96? Is every pitcher’s graph similar?) and other stats (Z/O-Swing, Z/O-Contact, basically everything here)


A Tale of Two Endpoints

Another week, another riddler. Here is this week’s problem:

You’ve just been hired to work in a juicy middle-management role at Riddler HQ — welcome aboard! We relocated you to a tastefully appointed apartment in Riddler City, five blocks west and 10 blocks south of the office. (The streets of Riddler City, of course, are laid out in a perfect grid.) You walk to work each morning and back home each evening. Restless and inquisitive mathematician that you are, you prefer to walk a different path along the streets each time. How long can you stay in that apartment before you are forced to walk the same path twice? (Assume you don’t take paths that are longer than required, and assume beaucoup bonus points for not using your computer.)

Extra credit: What if you instead took a bigger but more distant apartment, M blocks west and N blocks south of the office?

This problem is pretty easy if you are able to think about it from the right point of view.

What we want is a list of directions, either S, south, or W, west. For example, one valid route is SSSSSSSSSSWWWWW (we can’t have E or N because they would make our route inefficient). Now, we want to reorder these to show every possible route. Using high school math, we know that the number of permutations with repeated elements follows the form:

\frac{N!}{A! \times B! \times C!}

 

Where the set of \text{\small N} letters has \text{\small A} identical items, \text{\small B} identical items, \text{\small C} identical items, etc…

Using this we get our solution to be:

\frac{15!}{10! \times 5!} = \text{\small 3003 trips}
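A short check of that count, which also covers the extra-credit case of M blocks west and N blocks south. This is a minimal sketch assuming Python 3.8+ for `math.comb`; `count_paths` and `count_paths_grid` are hypothetical names, and the second function recounts the routes block by block as an independent cross-check.

```python
from math import comb

def count_paths(west, south):
    """(west + south)! / (west! * south!) shortest routes."""
    return comb(west + south, west)

def count_paths_grid(west, south):
    """Count routes corner by corner: each corner is reached from two neighbors."""
    ways = [[1] * (south + 1) for _ in range(west + 1)]
    for w in range(1, west + 1):
        for s in range(1, south + 1):
            ways[w][s] = ways[w - 1][s] + ways[w][s - 1]
    return ways[west][south]

print(count_paths(5, 10), count_paths_grid(5, 10))
```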

 

This can be confirmed with Python, which also gives us this heatmap of the traveler’s path

heatmap4.png

There ya go! With 3003 different paths and two walks a day, you can work 1501 days, or about 4 years!

import matplotlib.pylab as plt
import seaborn as sns

south = 10
east = 5

distance = south + east
correct_route = []
heatmap = [[0 for e in range(max(south,east)+3)] for s in range(max(south,east)+3)]
for attempt in range(2**distance):
    route = bin(attempt)[2:]

    #if we go east exactly the right number of times then route is "correct". We do this by counting moves east
    easterness = 0
    for element in route: easterness += int(element)
    if easterness == east:

        #bin doesn't prepend 0's... we have to manually
        while len(route) != distance: route = '0'+route

        #Add route, then go through and add locations we step on to heatmap
        correct_route.append(route)
        location = [1,1+south]
        for element in route:
            heatmap[location[1]][location[0]] +=1
            if int(element) == 1: location[0] +=1
            else: location[1] -= 1
        heatmap[location[1]][location[0]] +=1

#Format and plot
with sns.axes_style("white"):
    ax = sns.heatmap(heatmap, vmax=3500, square=True,  cmap="YlGnBu", cbar = True)
    ax.axis('off')
    ax.legend().set_visible(False)

    ax.set_title("Heatmap of Commute")

    plt.show()

The Eccentric Billionaire and the Banker

Another week, another riddler. Here is this week’s problem:

An eccentric billionaire has published a devilish math problem that she wants to see solved. Her challenge is to three-color a specific map that she likes — that is, to color its regions with only three colors while ensuring that no bordering regions are the same color. Being an eccentric billionaire, she offers $10 million to anyone who can present her with a solution.

You come up with a solution to this math problem! However, being a poor college student, you cannot come up with the $10,000 needed to travel to the billionaire’s remote island lair. You go to your local bank and ask the manager to lend you the $10,000. You explain to him that you will soon be winning $10 million, so you will easily be able to pay back the loan. But the manager is skeptical that you actually have a correct solution.

Of course, if you simply hand the manager your solution, there is nothing preventing him from throwing you out of his office and collecting the $10 million for himself. So, the question is: How do you prove to the manager that you have a solution to the problem without giving him the solution (or any part of the solution that makes it easy for him to reproduce it)?

Oh boy, okay so here is how I do it. First, we look at the map and number it:

riddler1.jpg

Next, you and the banker come up with a continuous path through the regions that crosses every border between neighboring regions (repeating is okay):

riddler6.jpg

Now look at the order of regions this creates (starting/ending at the green arrow):

1, 2, 5, 3, 5, 2, 3, 4, 3, 6, 5, 1, 6, 1

Then the banker leaves. We rotate the order of the path however we like:

original order:

1, 2, 5, 3, 5, 2, 3, 4, 3, 6, 5, 1, 6, 1

rotated one order:

2, 5, 3, 5, 2, 3, 4, 3, 6, 5, 1, 6, 1, 2

rotated two order:

5, 3, 5, 2, 3, 4, 3, 6, 5, 1, 6, 1, 2, 5

We then put tokens face down corresponding to each region’s color in the rotated order we picked. In our example solution:

riddler2.jpg

If we used the “original order” the tokens would be:

(mind you they are face down)

R, Y, G, R, G, Y, R, Y, R, Y, G, R, Y, R

riddler7.jpg

The banker is then brought back in and allowed to flip over any two adjacent tokens. If the coloring is valid, the two flipped tokens will never be the same color.

This process of rotating the path order, setting up the tokens and having the banker flip two can be repeated as many times as required.
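One round of this procedure can be simulated as a rough sketch. The path and the coloring are copied from the example above; the function names `rotate`, `lay_tokens` and `banker_flips`, and the banker picking a pair uniformly at random, are assumptions made for illustration.

```python
import random

# Path and coloring copied from the example above.
PATH = [1, 2, 5, 3, 5, 2, 3, 4, 3, 6, 5, 1, 6, 1]
COLORS = {1: "R", 2: "Y", 3: "R", 4: "Y", 5: "G", 6: "Y"}

def rotate(path, k):
    """Rotate the closed path by k steps (the last entry repeats the first)."""
    loop = path[:-1]
    k %= len(loop)
    rotated = loop[k:] + loop[:k]
    return rotated + [rotated[0]]

def lay_tokens(path, colors):
    """Face-down tokens: one color per region visited, in path order."""
    return [colors[region] for region in path]

def banker_flips(tokens, rng=random):
    """The banker turns over one randomly chosen pair of adjacent tokens."""
    i = rng.randrange(len(tokens) - 1)
    return tokens[i], tokens[i + 1]

for round_number in range(5):
    tokens = lay_tokens(rotate(PATH, random.randrange(len(PATH) - 1)), COLORS)
    first, second = banker_flips(tokens)
    print(round_number, first, second, "ok" if first != second else "CAUGHT")
```

With a valid coloring every round prints "ok"; with an invalid one, repeated rounds make it increasingly likely that some flipped pair matches.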

Game Theory on a Number Line

Another week, another Riddler. Here is a fun problem:

Ariel, Beatrice and Cassandra — three brilliant game theorists — were bored at a game theory conference (shocking, we know) and devised the following game to pass the time. They drew a number line and placed $1 on the 1, $2 on the 2, $3 on the 3 and so on to $10 on the 10.

Each player has a personalized token. They take turns — Ariel first, Beatrice second and Cassandra third — placing their tokens on one of the money stacks (only one token is allowed per space). Once the tokens are all placed, each player gets to take every stack that her token is on or is closest to. If a stack is midway between two tokens, the players split that cash.

How will this game play out? How much is it worth to go first?

To solve this we have to assume each player is a perfect logician – then we work in reverse: say we know exactly where Ariel and Beatrice have placed their tokens. If this is the case we can find the optimal place for Cassandra to put her token to maximize her earnings. Using this we can back out the optimal place for Beatrice to put her token after each of Ariel’s moves: it is the spot that leaves Beatrice with the most money once Cassandra replies with her best move. Using this we can back out Ariel’s best move: it’s the spot that is best for Ariel even when Beatrice and Cassandra both use their best moves. Here’s an example:


#Of the form: 

# [[A's token place (1 indexed), B's token place, C's token place], [A's winnings, B's winnings, C's winnings]]

[[1, 5, 2], [1, 49, 5]]
[[1, 5, 3], [2.0, 47.0, 6.0]]
[[1, 5, 4], [3, 45, 7]]
[[1, 5, 6], [4.5, 10.5, 40]]
[[1, 5, 7], [4.5, 13.5, 37.0]]
[[1, 5, 8], [4.5, 16.5, 34]]
[[1, 5, 9], [4.5, 20.0, 30.5]]
[[1, 5, 10], [4.5, 23.5, 27]]

From this we can figure out if A places token on 1 and B places on 5 then C will always place on 6. Here is an example for B’s decision:


#Of the form:

# [[token places (1 indexed)], [winnings]]

[[1, 2, 3], [1, 2, 52]]
[[1, 3, 4], [2.0, 4.0, 49]]
[[1, 4, 5], [3, 7, 45]]
[[1, 5, 6], [4.5, 10.5, 40]]
[[1, 6, 7], [6, 15, 34]]
[[1, 7, 8], [8.0, 20.0, 27]]
[[1, 8, 7], [8.0, 27, 20.0]]
[[1, 9, 8], [10, 19, 26]]
[[1, 10, 9], [12.5, 10, 32.5]]

Here we determine that if A places on 1, then it is in B’s best interest to place on 8. We do this a final time and determine the players will play:

#Of the form:
# [[token places (1 indexed)], [winnings]]

[[5, 9, 8], [21, 19, 15]]

And this is the answer to our problem.
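The reverse reasoning can be written as a short recursive search. This is a sketch, not the code that produced the tables above: `payoffs` and `solve` are hypothetical names, and breaking ties in favor of the lowest-numbered spot is an assumption the puzzle does not specify.

```python
from fractions import Fraction

STACKS = range(1, 11)  # $1 on the 1, $2 on the 2, ... $10 on the 10

def payoffs(tokens):
    """Winnings for each player, given token positions in play order."""
    winnings = [Fraction(0) for _ in tokens]
    for stack in STACKS:
        nearest = min(abs(stack - spot) for spot in tokens)
        owners = [i for i, spot in enumerate(tokens) if abs(stack - spot) == nearest]
        for i in owners:  # a stack midway between two tokens is split
            winnings[i] += Fraction(stack, len(owners))
    return winnings

def solve(placed=()):
    """Final token positions if every remaining player plays her best move."""
    if len(placed) == 3:
        return placed
    player = len(placed)
    best = None
    for spot in STACKS:
        if spot in placed:
            continue
        outcome = solve(placed + (spot,))
        # Assumption: ties go to the lowest-numbered spot (strict comparison).
        if best is None or payoffs(outcome)[player] > payoffs(best)[player]:
            best = outcome
    return best

print(solve((1, 5)))  # Cassandra's reply when A is on 1 and B is on 5
print(solve((1,)))    # Beatrice's and Cassandra's replies when A is on 1
print(solve())        # the full game
```

Calling `solve` with a partial placement reproduces one row of the tables above, which makes it easy to spot-check individual decisions.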

Matching Game

Another week, another riddler. I really liked this one, if a bit clear cut. The problem:

I have a matching game app for my 4-year-old daughter. There are 10 different pairs of cards, each pair depicting the same animal. That makes 20 cards total, all arrayed face down. The goal is to match all the pairs. When you flip two cards up, if they match, they stay up, decreasing the number of unmatched cards and rewarding you with the corresponding animal sound. If they don’t match, they both flip back down. (Essentially like Concentration.) However, my 1-year-old son also likes to play the game, exclusively for its animal sounds. He has no ability to match cards intentionally — it’s all random.

If he flips a pair of cards every second and it takes another second for them to either flip back over or to make the “matching” sound, how long should my daughter expect to have to wait before he finishes the game and it’s her turn again?

To solve this we can look at each “level” independently, where a level is the number of pairs of cards remaining. We start at level 10 and our goal is to get to level 0. To figure out the estimated amount of time on, for example, level 10 we use the equation:

t_{10} = \frac{1}{19}(2) + \frac{18}{19}(2+t_{10})

This says the first card selection doesn’t matter, but after selecting it we have to select the second card, and only one of the 19 remaining cards matches it. This equation shows that there is a 1/19 chance we correctly select the card and proceed, at a cost of two seconds. There is also an 18/19 chance we fail, which costs us two seconds plus the time to successfully complete the level. Solving this gives us:

t_{10} = 38

Extending this we can see that solving for a generalized level is:

t_{n} = \frac{1}{2n-1}(2) + \frac{2n-2}{2n-1}(2+t_{n})
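Rearranging the generalized equation gives t_n = 2(2n-1), since the expected number of attempts at a level is one over the chance of a match. Here is a minimal sketch in exact arithmetic; `level_time` and `total_time` are hypothetical names used only for this check.

```python
from fractions import Fraction

def level_time(n):
    """Expected seconds to clear a level with n pairs still face down."""
    p_match = Fraction(1, 2 * n - 1)  # one matching card among the 2n-1 others
    # t = 2 + (1 - p_match) * t  rearranges to  t = 2 / p_match
    return 2 / p_match

def total_time(pairs):
    """Expected seconds to clear every level from `pairs` down to 1."""
    return sum(level_time(n) for n in range(1, pairs + 1))

print(level_time(10), total_time(10))
```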

Summing levels 1-10 gives us 200 seconds. This was confirmed by brute forcing it in Python.

import random, math

def run_trial(card_pairs):
    time = 0
    # Two copies of each animal: [0, 0, 1, 1, ...]
    card_set = [math.floor(i/2) for i in range(card_pairs*2)]
    while len(card_set):
        # Shuffle the positions and flip the first two
        selection_index = [i for i in range(len(card_set))]
        random.shuffle(selection_index)
        if card_set[selection_index[0]] == card_set[selection_index[1]]:
            # A match: both cards leave the board
            value = card_set[selection_index[0]]
            card_set.remove(value)
            card_set.remove(value)
        # Every attempt costs two seconds, match or not
        time += 2
    return time

sum_time = 0
trials = 10000
for i in range(trials): sum_time += run_trial(10)
print("Average time: " + str(sum_time/trials))