Pannin for Pangrams Pt II

Four years ago I wrote part one of Panning for Pangrams. I looked through about a million tweets looking for a new pangram (think “The quick brown fox jumps over the lazy dog”). My goal was to find out if someone has accidentally tweeted a more efficient pangram — basically an applied infinite monkey theory.

If you didn’t read the initial post don’t worry, here are the rules:

1. The tweet cannot include a username (@BuildABarr for example) or url (http://anything.com)
2. The tweet gets 9 points for each unique alphabetical character. This incentives including more of the alphabet
3. The tweet gets -1 point for each repeated character or non-whitespace character. This means numbers or symbols hurt the score. This incentives coherence
4. I get veto power. I really only used this on people who tweeted the entire alphabet because I guess thats a thing

I then searched 9m tweets from this database: Below are some tweets that score well:

with 24 characters for 188 points

with 24 characters for 193 points

with 24 characters for 195 points

So there is lots of room for improvement!

Code here:

<pre>import pandas as pd
import matplotlib.pyplot as plt
import os, time, re

path = os.path.normpath("D:/test_set_tweets.txt")

def run_data_logger():
max_points = 0

with open(path, "r", errors='ignore') as fp: #strips away non-utc chars
for line in fp:

line_data = line.split("\t")
if len(line_data) &gt; 2:
text = line_data[2]
tweet = text

if text.find("http:") is not -1 or text.find("@") is not -1: text = ""
text = text.upper() #uppercase all to avoid doubling up

text = re.sub('[ ]', "", text) #destroy spaces

char_count = len(text)

text = ''.join(set(text)) #remove repeated letters
text = re.sub('[^A-Za-z]', "", text) #delete non-characters

points = 9*len(text) -1*(char_count-len(text))

if points &gt; 185 and len(text) &gt;= 23:
print("Points: ",points," chars: ", len(text)," Max tweet: ",tweet)

run_data_logger()</pre>


Sum to 15 Card Game

Another week, another riddler. This week’s:

From John Hanna, a different kind of “card” game:

You and I are playing a game. It’s a simple one: Spread out on a table in front of us, face up, are nine index cards with the numbers 1 through 9 on them. We take turns picking up cards and putting them in our hands. There is no discarding.

The game ends in one of two ways. If we run out of cards to pick up, the game is a draw. But if one player has a set of three cards in his or her hand that add up to exactly 15 before we run out of cards, that player wins. (For example, if you had 2, 4, 6 and 7, you would win with the 2, 6 and 7. However, if you had 1, 2, 3, 7 and 8, you haven’t won because no set of three cards adds up to 15.)

Let’s say you go first. With perfect play, who wins and why?

This one is pretty fun. Here is how I looked at it:

Imagining that each player can see into the future, they will take the option that maximizes their outcome. Therefore my plan was to find the end result of every hand (9! or ~400,000 outcomes). Then, go up a level — when the player had to decide between the two outcomes they would pick the one that gives them the best result. You do this again and again, flipping who gets to choose between the options and therefore which direction they are optimizing for.

At the end of this we see that the players will result in a tie every time.

DIY Hangboard

Being a renter I really wanted a way to work on bouldering while at home. In order to work on both pull-up and finger strength the internet suggested getting a hangboard — however they didn’t look like they would fit the geometry of my odd door or required big bolts. As a result I had to go with the DIY option

The final product

I didn’t document it well but a rough parts list:

Here is an image from the other side

Recently on FiveThirtyEight they have used “radar charts” quite a bit. Frequently used in video games to show multidimensional data while being visually striking.

While these are cool what I found myself doing was trying to compare the relative areas. While this isn’t incorrect it can be deceiving — the order and therefore shape they make can skew the amount of area they take up. To prove this I recreated them in Python:

It is interesting that the area can grow by ~20% just by reordering the attributes.

Maybe not revolutionary but something to consider when making or viewing these graphs

Creating Crossword Grids

Another week, another riddler. This week’s puzzle:

Crossword puzzle grids typically obey a few rules and conventions.

1. They are 15-by-15.
2. They are rotationally symmetric — that is, if you turn the grid upside down it appears exactly the same.
3. All the words — that is, all the horizontal and vertical sequences of white squares — must be at least three letters long. All the letters must appear in an “across” word and a “down” word.
4. The grid must be entirely connected — that is, there can be no “islands” of white squares separated from the rest by black squares.

First question: How many such crossword grids are there?

Second question: Crossword constructors do well to avoid using “cheater squares,” black squares whose addition makes some words shorter but does not change the puzzle’s total word count. How many grids are there without cheater squares?

Extra credit: The Sunday “New York Times” puzzle is 21-by-21. How many of those are there, with and without cheater squares?

So in short we didn’t make all of the grids — this turned out to be very difficult and I am very interested to see the approach that others used to tackle this problem. However we did make something that generated around a thousand of them

Code can be found here

Elf Playlist

Another week, another riddler. Here is this week’s problem:

In Santa’s workshop, elves make toys during a shift each day. On the overhead radio, Christmas music plays, with a program randomly selecting songs from a large playlist.

During any given shift, the elves hear 100 songs. A cranky elf named Cranky has taken to throwing snowballs at everyone if he hears the same song twice. This has happened during about half of the shifts. One day, a mathematically inclined elf named Mathy tires of Cranky’s sodden outbursts. So Mathy decides to use what he knows to figure out how large Santa’s playlist actually is.

Help Mathy out: How large is Santa’s playlist?

So to do this we used minimized N for the following

$P_{N} = \prod_{i=1}^{100}(1-\frac{i}{N})\approx0.5$

Each term represents the chance that the song is not a repeat — so the first song has a 0/N chance, so the chance that it is not that song is 1. This gets multiplied by the (1-1/N) for the second song, etc. etc.

This results in 7176 when our shift length is 100. Next, we vary the shift length and find the resulting playlist length:

This, interestingly, follows the following equation for a playlist length of l and playlist length of N:

$N = 0.7214l^2 - 1.8316l+2.6343$

Now we know why the shift length is 100, its likely the total number of Christmas songs 🙂

import matplotlib.pyplot as plt

def num_songs_in_playlist(goal_prob,shift_len_in_songs):
prob = 0
songs_in_playlist = 200
while prob < goal_prob:
prob = 1
for i in range(shift_len_in_songs): prob = (1-(i/songs_in_playlist)) * prob
songs_in_playlist += 1
return songs_in_playlist

x = [i for i in range(50,200)]
y = [num_songs_in_playlist(0.5,i) for i in range(50,200)]

plt.xlabel("Shift Length (songs)")
plt.ylabel("Playlist Length")
plt.scatter(x,y)
plt.title("Santa's playlist length to ensure repeating\n probability 0.5 over various shift lengths")


imagine a hockey game

Okay okay this is a fun problem. To do this we have to look at team B’s expected goals.

The chance that team B gets 0 points is

$P_{0} = (1-0.055)^{20} = 0.3226$

The chance team B gets 1 point is:

$P_{1} = 20*(0.055)*(1-0.055)^{19} = 0.3795$

and the chance team B gets 2+ points is:

$P_{2+} = 1 - P_{0} - P_{1} = 0.3019$

While the difference isn’t large, it holds that team A is expected to win. Counterintuitive!