Show HN: What happens if you make a crossword out of Reddit r/gaming
The "Joystick Jargon" project merges traditional crosswords with gaming vocabulary using a Reddit dataset, machine learning for keyword extraction, and invites community feedback on its innovative approach.
The project "Joystick Jargon" involves creating crosswords that integrate traditional elements with gaming vocabulary. The creator utilized a dataset from Reddit, specifically focusing on gaming-related subreddits such as r/gaming, r/dota2, and r/leagueoflegends. The process began with data filtering to extract relevant content, followed by keyword extraction using machine learning techniques such as BERT embeddings and cosine similarity. The data was then preprocessed to remove entries unsuitable for crossword puzzles. A heuristic algorithm was employed to generate grids and place words efficiently, while a Large Language Model was used to create context-aware clues for the words. The final product features a balanced mix of traditional crossword elements and gaming terminology, although it may contain mature language due to the nature of the Reddit data. The creator acknowledges that the project may be overengineered but views it as a valuable exploration of natural language processing and optimization algorithms in the context of gaming culture. The creator invites feedback from the community on this approach and suggests that similar computational techniques could be applied to other domains for content creation.
- The project combines traditional crossword puzzles with gaming vocabulary.
- It utilizes a large Reddit dataset filtered for gaming-related content.
- Machine learning techniques were employed for keyword extraction and clue generation.
- The final puzzles maintain a balance of traditional and gaming elements.
- The creator seeks community feedback on the approach and potential applications in other domains.
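To make the keyword-extraction step concrete, here is a minimal sketch of ranking candidate words by cosine similarity to a document vector. It is not the creator's code: the names `cosine_similarity` and `rank_keywords` are hypothetical, and the three-dimensional vectors are made-up stand-ins for the embeddings a BERT-style model would produce.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def rank_keywords(doc_vec, candidates, top_k=3):
    """Return the top_k (word, score) pairs closest to the document vector."""
    scored = [(word, cosine_similarity(doc_vec, vec))
              for word, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors (assumption): a real pipeline would embed the post and each
# candidate word with a language model instead of hard-coding numbers.
doc = [0.9, 0.1, 0.2]
candidates = {
    "speedrun": [0.8, 0.2, 0.1],
    "respawn": [0.7, 0.0, 0.6],
    "weather": [0.0, 1.0, 0.1],
}

for word, score in rank_keywords(doc, candidates, top_k=2):
    print(f"{word}: {score:.3f}")
```

Words whose vectors point in roughly the same direction as the document score close to 1, so off-topic candidates such as "weather" fall to the bottom of the ranking.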
Related
Solving puzzles faster than humanly possible
The Opus Magnum challenge tasks players with automating puzzle-solving to optimize Cost, Cycles, and Area metrics. Participants submit solutions for evaluation, exploring automated vs. human strategies, hybrid approaches, scoring systems, mods, and bots.
I read the dictionary to make a better game (2023)
The development of the word search game Tauggle focuses on achieving 100% completion on each board by curating a dictionary with common words. Balancing inclusivity and exclusivity enhances player satisfaction.
Regex Crossword
A regex crossword puzzle challenges participants to fill a grid using simplified regex syntax. The author encourages learning regex basics and promotes their book on enhancing Python skills.
Waking up to NYT Crosswords on reMarkable paper tablet
The author integrated New York Times Crossword puzzles with their reMarkable tablet, developing an automated method for downloading and uploading puzzles, while highlighting issues of data ownership and accessibility in digital services.
A Vector Database Plays Mario Kart 64
Qdrant Kart is an innovative application that enhances Mario Kart 64 using a Vector Database. The article details its architecture, data collection, embedding generation, and emulator integration, with a video demonstration included.
I think it might be worth working on prompting to make sure the answer is a unique solution to the hint (or at least closer to unique). What model are you using here?
There's one outstanding issue: from what I can tell, at least 75% of the answers are relatively generic nouns or verbs.
Part of the deep satisfaction in solving a crossword puzzle is the specificity of the answer. It's far more gratifying to answer a question with something like "Hawking" than with "scientist", or with "Mandelbrot" rather than "shape".
It might be worth going back and looking up a compendium of games released in the last couple decades, cross referencing them with their manuals, GameFaqs, etc. and peppering this information into the crossword.
However, this is well done and it inspired a thought: I wonder if it would be possible to procedurally generate word games, such as a mini crossword or a word ladder, as part of a language learning regime? Think Duolingo but for word puzzle fans.
As an example, you solve a mini crossword every day where 80% of the clues/answers are in English, and 20% are drawn from a progressive set of vocabulary in the other language.
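A rough sketch of how such an 80/20 mix could be drawn, assuming two plain word lists. The function name `pick_entries`, its parameters, and the sample vocabulary are all hypothetical and only illustrate the idea.

```python
import random

def pick_entries(native_words, target_words, total=10, target_share=0.2, seed=None):
    """Pick `total` answers, with about `target_share` of them from the target language."""
    rng = random.Random(seed)
    n_target = round(total * target_share)
    n_native = total - n_target
    if n_target > len(target_words) or n_native > len(native_words):
        raise ValueError("not enough words to fill the puzzle")
    chosen = rng.sample(native_words, n_native) + rng.sample(target_words, n_target)
    rng.shuffle(chosen)  # avoid always listing the foreign words last
    return chosen

# Made-up vocabulary for illustration only.
english = ["apple", "bridge", "candle", "dragon", "engine",
           "forest", "garden", "harbor", "island", "jacket"]
spanish = ["gato", "perro", "casa", "libro"]

print(pick_entries(english, spanish, total=10, target_share=0.2, seed=1))
```

To make the vocabulary "progressive", `target_words` could be limited to the words the learner has unlocked so far, growing a little each day.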
After a ~30-hour weekend coding marathon, I've just pushed a new version of the original joystick-jargon (r/gaming) and a new r/leagueoflegends puzzle live.
https://capsloq.de/crosswords/joystick-jargon
https://capsloq.de/crosswords/r/leagueoflegends
What changed?
- 5 new puzzles for r/gaming
- 6 new puzzles for r/leagueoflegends
- Old puzzles deleted
- New extraction algorithm (everything new: tokenizer, transformers, pipelines, model, word and document embeddings, scoring, complete overhaul ...)
- New clue prompting
- Grid can now only contain diagonal black boxes (should guarantee intersections)
- Fixed numbering bug on the grid
- Proofread each puzzle and made some slight adjustments to guarantee puzzle integrity.
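For readers curious about grid numbering, here is a minimal sketch of the conventional crossword rule: scan the grid row by row and number every open square that starts an across or a down word. This is the standard convention, not necessarily how the numbering fix above was implemented. The function name `number_grid` and the `#`-for-black-square encoding are assumptions.

```python
def number_grid(grid):
    """Map (row, col) -> clue number; grid is a list of equal-length strings, '#' = black."""
    rows, cols = len(grid), len(grid[0])

    def is_open(r, c):
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] != "#"

    numbers = {}
    next_number = 1
    for r in range(rows):
        for c in range(cols):
            if not is_open(r, c):
                continue
            # A word starts here if the previous square is blocked
            # and the next square in that direction is open.
            starts_across = not is_open(r, c - 1) and is_open(r, c + 1)
            starts_down = not is_open(r - 1, c) and is_open(r + 1, c)
            if starts_across or starts_down:
                numbers[(r, c)] = next_number
                next_number += 1
    return numbers

grid = [
    "..#",
    "...",
    "#..",
]
print(number_grid(grid))
```

Under this rule a square that starts both an across and a down word receives a single shared number, so the two entries are labelled, for example, "1 Across" and "1 Down".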
Warning: While proofreading the League of Legends Q&A I realized that I've never played that game, so I couldn't verify everything!
Thank you very much to everyone who provided feedback to improve on v1.
I really hope you feel an increase in quality. I am looking forward to even more feedback and to improving further.
Planning to use more suitable datasets in the future. It's super hard to get a quality crossword word list out of r/gaming.
Have fun puzzling! (please)
> Bar
Very wrong, unless we’re simulating what someone unfamiliar with the gaming jargon would think - in which case very accurate.
The way the NYT does this on their web interface is nice. They have the puzzle in one column, the across clues in a second column, and the down clues in a third column. The clue columns each are scrollable.
It automatically scrolls to keep the clue for whatever word you have selected in view and highlights that clue. It also scrolls the other list so that the clue for whatever word crosses the selected word at the highlighted square stays visible, and marks that clue in the margin of its clue list.
They do something similar in their iPad app, but also show, below the puzzle, the clue for the selected word and for whatever word crosses it at the highlighted square. With that you can concentrate on the grid and a fixed clue area.
> Unportal
what?
> 19: An online forum section where gamers come together to discuss specific topics
> ITT
I don't think that's right...
Most of the questions just seem like normal crossword questions, but with the term "in games" added to it.
I'm not gonna sugarcoat it: this sucks. The crossword grids often have totally isolated words. 1 across and 2 down start with the same letter. The questions are nonsensical. I'd hardly go so far as to call it an interesting proof of concept.
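The isolated-word complaint is something a generator could check mechanically. Below is a minimal sketch, assuming the grid is a list of strings with `#` for black squares, that flood-fills from one open square and reports whether every other open square is reachable. The function name `is_connected` is hypothetical, and nothing here reflects how the project actually validates its grids.

```python
from collections import deque

def is_connected(grid):
    """True if all open squares form one orthogonally connected region."""
    rows, cols = len(grid), len(grid[0])
    open_cells = {(r, c) for r in range(rows) for c in range(cols)
                  if grid[r][c] != "#"}
    if not open_cells:
        return True
    start = next(iter(open_cells))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first flood fill over open squares
        r, c = queue.popleft()
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nxt in open_cells and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == open_cells

good = ["..#",
        "...",
        "#.."]
bad = ["..#",
       "###",
       "#.."]
print(is_connected(good), is_connected(bad))
```

A grid that fails this check contains at least one word that shares no letters with the rest of the puzzle, so it could be rejected or regenerated before clues are written.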
Sadly, the clues and the words relating to them feel off, making the whole game rather unenjoyable.
I always think about doing something similar for a similar project. Are you able to do it completely automatically or do you have to help finesse the words to fit?