October 18th, 2024

Show HN: What happens if you make a crossword out of Reddit r/gaming

The "Joystick Jargon" project merges traditional crosswords with gaming vocabulary drawn from a Reddit dataset, uses machine learning for keyword extraction, and invites community feedback on its approach.

The "Joystick Jargon" project creates crosswords that combine traditional elements with gaming vocabulary. The creator used a Reddit dataset, focusing on gaming-related subreddits such as r/gaming, r/dota2, and r/leagueoflegends. The process began with filtering the data to extract relevant content, followed by keyword extraction using machine learning techniques such as BERT embeddings and cosine similarity. The data was then preprocessed to remove entries unsuitable for crossword puzzles. A heuristic algorithm generated the grids and placed the words, while a Large Language Model wrote context-aware clues for them. The final product features a balanced mix of traditional crossword elements and gaming terminology, although it may contain mature language due to the nature of the Reddit data. The creator acknowledges that the project may be overengineered but views it as a valuable exploration of natural language processing and optimization algorithms in the context of gaming culture. The creator invites feedback from the community on this approach and suggests that similar computational techniques could be applied to content creation in other domains.

- The project combines traditional crossword puzzles with gaming vocabulary.

- It utilizes a large Reddit dataset filtered for gaming-related content.

- Machine learning techniques were employed for keyword extraction and clue generation.

- The final puzzles maintain a balance of traditional and gaming elements.

- The creator seeks community feedback on the approach and potential applications in other domains.
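
The summary names cosine similarity over BERT embeddings as the keyword-extraction step but gives no detail. As a rough illustration only, the sketch below ranks candidate words by cosine similarity to a document vector. The function names (`cosine_similarity`, `rank_keywords`) and the toy 3-dimensional vectors are assumptions made for this example; a real pipeline would get its vectors from an embedding model, and the project's actual scoring is not described in the source.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_keywords(doc_vec, candidates):
    """Return candidate words sorted from most to least similar to doc_vec.

    candidates maps each word to its (hypothetical) embedding vector.
    """
    scored = [(cosine_similarity(doc_vec, vec), word) for word, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored]


# Toy vectors standing in for real embeddings (assumption for illustration).
doc = [0.9, 0.1, 0.0]
cands = {
    "respawn": [0.8, 0.2, 0.1],
    "weather": [0.0, 0.1, 0.9],
    "loot": [0.5, 0.5, 0.0],
}
print(rank_keywords(doc, cands))
```

Words whose vectors point in roughly the same direction as the document vector rank first, which is the general idea behind embedding-based keyword extraction.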

21 comments
By @maxrmk - 6 months
Tried it! I really like the idea, but I think the clue generation could use some work. Every clue ended in "in games", and honestly most of them were not really game related to start with. For example the clue "Place in games where characters go to rest and replenish health or mana" had the solution "bar"... which I wouldn't describe as right. Similarly "The name of a popular character who may need rescuing in some games" was "Emily".

I think it might be worth working on prompting to make sure the answer is a unique solution to the hint (or at least closer to unique). What model are you using here?

By @vunderba - 6 months
Nice work. I've also experimented with procedurally generated crossword puzzles, though I really wanted to constrain them to symmetric layouts like what you would find in the New York Times, which made it more difficult.

There's an outstanding issue and that is (from what I can tell) at least 75% of the answers correspond to relatively generic nouns or verbs.

Part of the deep satisfaction in solving a crossword puzzle is the specificity of the answer. It's far more gratifying to answer a question with something like "Hawking" than to answer with "scientist", or to answer with "mandelbrot" rather than "shape".

It might be worth going back and looking up a compendium of games released in the last couple of decades, cross-referencing them with their manuals, GameFAQs, etc., and peppering this information into the crossword.
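
The symmetric layouts mentioned in this comment usually mean 180-degree rotational symmetry of the black squares. A minimal sketch of that check follows, assuming a grid stored as a list of equal-length strings with `#` marking black squares; this representation and the function name are assumptions for illustration, not anything taken from the project.

```python
def is_rotationally_symmetric(grid):
    """True if the black-square pattern is unchanged by a 180-degree rotation.

    grid is a list of equal-length strings; '#' marks a black square.
    """
    n_rows = len(grid)
    n_cols = len(grid[0])
    for r in range(n_rows):
        for c in range(n_cols):
            here = grid[r][c] == "#"
            # Cell that (r, c) maps to after rotating the grid by 180 degrees.
            opposite = grid[n_rows - 1 - r][n_cols - 1 - c] == "#"
            if here != opposite:
                return False
    return True


symmetric = ["#..", "...", "..#"]
asymmetric = ["#..", "...", "..."]
print(is_rotationally_symmetric(symmetric), is_rotationally_symmetric(asymmetric))
```

A generator could use a check like this as a filter, or enforce the constraint directly by always placing black squares in opposite pairs.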

By @darepublic - 6 months
In the first puzzle I received, there was a clue for 1 across but there was no such place on the board. It was actually 2 across.
By @zeugmata9 - 6 months
Echoing what vunderba said, it's hard for me to enjoy this because I'm used to the symmetric layouts of e.g. the NYT and the satisfying flavor of their clues.

However, this is well done and it inspired a thought: I wonder if it would be possible to procedurally generate word games, such as a mini crossword or word ladder and so on, as part of a language learning regime? Think Duolingo but for word puzzle fans.

As an example, you solve a mini crossword every day where 80% of the clues/answers are in English, and 20% are drawn from a progressive set of vocabulary in the other language.

By @zomh - 6 months
~~~ UPDATE ~~~~

After a ~30-hour weekend coding marathon, I've just pushed a new version of the original joystick-jargon (r/gaming) and a new r/leagueoflegends puzzle live.

https://capsloq.de/crosswords/joystick-jargon

https://capsloq.de/crosswords/r/leagueoflegends

What changed?

- 5 new puzzles for r/gaming

- 6 new puzzles for r/leagueoflegends

- Old puzzles deleted

- New extraction algorithm (everything new: tokenizer, transformers, pipelines, model, word and document embeddings, scoring, complete overhaul ...)

- New clue prompting

- Grid can now only contain diagonal black boxes (should guarantee intersections)

- Fixed numbering bug on the grid

- Proofread each puzzle and made some slight adjustments to guarantee puzzle integrity.

Warning: When I proofread the League of Legends Q&A, I realized that since I've never played that game, I couldn't verify everything!

Thank you very much to everyone who provided feedback to improve on v1.

I really hope you feel an increase in quality. I am looking forward to even more feedback and to improving further.

Planning to use more suitable datasets in the future. It's super hard to get a quality crossword list out of r/gaming.

Have fun puzzling! (please)
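
Several comments in this thread report clue numbers that do not match the grid, and the update above lists a numbering fix without showing it. For reference, the conventional numbering rule is to scan cells left to right, top to bottom, and give a new number to any open cell that starts an across or down word, with across and down entries that start in the same cell sharing one number. The sketch below implements that convention under an assumed grid representation (a list of strings with `#` for black squares); it is not the project's code.

```python
def number_grid(grid):
    """Assign conventional crossword numbers.

    grid is a list of equal-length strings; '#' marks a black square.
    Returns two dicts mapping clue number -> (row, col) of the starting cell,
    one for across entries and one for down entries.
    """
    rows, cols = len(grid), len(grid[0])

    def is_open(r, c):
        # Cells outside the grid count as closed, like black squares.
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] != "#"

    across, down = {}, {}
    number = 0
    for r in range(rows):
        for c in range(cols):
            if not is_open(r, c):
                continue
            # A word starts here if the previous cell is closed and the next is open.
            starts_across = not is_open(r, c - 1) and is_open(r, c + 1)
            starts_down = not is_open(r - 1, c) and is_open(r + 1, c)
            if starts_across or starts_down:
                number += 1  # one shared number per starting cell
                if starts_across:
                    across[number] = (r, c)
                if starts_down:
                    down[number] = (r, c)
    return across, down


example = ["..#", "...", "#.."]
print(number_grid(example))
```

Because clue numbers and grid labels both come from the same scan, deriving them from a single function like this avoids the two drifting apart.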

By @thih9 - 6 months
> Place in games where characters go to rest and replenish health or mana

> Bar

Very wrong, unless we’re simulating what someone unfamiliar with the gaming jargon would think - in which case very accurate.

By @tzs - 6 months
I had to reduce the size in my browser a couple or so times to see both the puzzle and all the clues at the same time. It might be better to have the clues in a separate scrolling region on the page.

The way the NYT does this on their web interface is nice. They have the puzzle in one column, the across clues in a second column, and the down clues in a third column. The clue columns each are scrollable.

It automatically scrolls to keep the clue for whatever word you have selected in view and highlights that clue. It also automatically scrolls so that the clue for whatever word crosses that word at the particular square you have highlighted is visible and marked in the margin of its clue list.

They do something similar in their iPad app, but also show, below the puzzle, the clue for the selected word and for whatever word crosses it at the highlighted square. With that you can concentrate on the grid and a fixed clue area.

By @mmastrac - 6 months
The 13th puzzle is probably the first decent one. I still can't quite put together how it's getting the clues in some cases (BMing = using pedals? noly = Just this, no other?) but it's certainly the first one that's within reach of being reasonable.
By @pimlottc - 6 months
Please respect your users. It’s frustrating to spend your time on a puzzle that might not even be plausibly solvable. If you’re going to ask strangers to do free testing you should at least bother to do some basic proofreading first.
By @polivier - 6 months
I love crossword puzzles, and this is very well done! A while back I made a crossword puzzle generator using constraint programming. It is definitely way slower than whatever heuristic you are using, but it is very good at generating dense crossword puzzles using small word lists. This would be useful if you were interested in making a crossword puzzle using words related to a single game, for instance. You can read more on it here if you are interested: https://pedtsr.ca/2023/generating-crossword-grids-using-cons...
By @costco - 6 months
By @dmonitor - 6 months
> 1: Allows movement between different areas in a game

> Unportal

what?

> 19: An online forum section where gamers come together to discuss specific topics

> ITT

I don't think that's right...

Most of the questions just seem like normal crossword questions, but with the term "in games" added to it.

I'm not gonna sugarcoat it: this sucks. The crossword grids often have totally isolated words. 1 across and 2 down start with the same letter. The questions are nonsensical. I'd hardly go so far as to call it an interesting proof of concept.

By @bryanhogan - 6 months
Tried it and think your approach is really cool.

Sadly, the clues and the words relating to them feel off, making the whole game rather unenjoyable.

By @Suppafly - 6 months
>5. Grid Generation: Implemented a heuristic crossword algorithm to create grids and place words efficiently.

I always think about doing something similar for a similar project. Are you able to do it completely automatically or do you have to help finesse the words to fit?
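
The thread never shows the heuristic asked about here. One common family of approaches is greedy intersection-based placement: lay down the longest word, then attach each remaining word perpendicular to an already placed word at a shared letter. The sketch below is a deliberately simplified example of that idea, with hypothetical function names and an unbounded grid stored as a dict. It only checks for letter conflicts and ignores the adjacency rules a real crossword needs, so it illustrates the technique rather than reconstructing the project's algorithm.

```python
def cells_for(word, row, col, horizontal):
    """List of ((row, col), letter) pairs the word would occupy."""
    return [((row, col + i) if horizontal else (row + i, col), ch)
            for i, ch in enumerate(word)]


def fits(grid, word, row, col, horizontal):
    """True if the word conflicts with no letter and crosses at least one.

    Simplification: words that merely touch side by side are not rejected.
    """
    crossings = 0
    for pos, ch in cells_for(word, row, col, horizontal):
        existing = grid.get(pos)
        if existing is None:
            continue
        if existing != ch:
            return False
        crossings += 1
    # Require a real crossing, but not a word lying entirely on old letters.
    return 1 <= crossings < len(word)


def place_words(words):
    """Greedily place words; returns (grid, placed, skipped)."""
    words = sorted(words, key=len, reverse=True)  # longest first
    grid, placed, skipped = {}, [], []
    if not words:
        return grid, placed, skipped
    first = words[0]
    grid.update(cells_for(first, 0, 0, True))
    placed.append((first, 0, 0, True))
    for word in words[1:]:
        spot = None
        for p_word, p_row, p_col, p_horizontal in placed:
            for j, p_ch in enumerate(p_word):
                for i, ch in enumerate(word):
                    if ch != p_ch:
                        continue
                    # Cross the placed word perpendicular to it at letter j.
                    cross_r = p_row if p_horizontal else p_row + j
                    cross_c = p_col + j if p_horizontal else p_col
                    horizontal = not p_horizontal
                    row = cross_r if horizontal else cross_r - i
                    col = cross_c - i if horizontal else cross_c
                    if fits(grid, word, row, col, horizontal):
                        spot = (row, col, horizontal)
                        break
                if spot:
                    break
            if spot:
                break
        if spot is None:
            skipped.append(word)  # no legal crossing found
            continue
        grid.update(cells_for(word, *spot))
        placed.append((word, *spot))
    return grid, placed, skipped


grid, placed, skipped = place_words(["portal", "loot", "tank", "xyz"])
print(placed, skipped)
```

Words that cannot cross anything are skipped rather than forced in, which is one place where manual finessing or a larger candidate list would come into play.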

By @dfc - 6 months
Why is there no 1 across on the puzzle? There are clues for 1 across but not on the "board".
By @joshdavham - 6 months
Awesome job! Also, might I suggest a bit of i18n? I can't read German..
By @bbstats - 6 months
On the first one I loaded, 1 across was labeled 2
By @mvdtnz - 6 months
Unportal? What the fuck is unportal? I have played games for 30 years and I've never heard the term "unportal". Google gives no useful results. That clue made me angrier than any crossword clue I've ever seen.