July 1st, 2024

My Python code is a neural network

This article explores using neural networks to identify program code in engineering messages. It walks through manual rules and a hand-written Python classifier, then suggests a recurrent neural network for automated detection.

This article discusses the use of neural networks to detect program code in messages, particularly focusing on identifying references to program code in engineering communications. The author presents various decision rules and a hand-written algorithm to distinguish code from regular text, highlighting the challenges of manual rule creation due to false positives and negatives. The article then introduces a Python classifier based on Rule 1, which achieves 100% precision but only 50% recall. To address the limitations of manual rule creation, the author suggests training a recurrent neural network (RNN) to automate the detection process. The RNN is described as a state machine that processes token sequences to determine if a message contains code, offering a more flexible and scalable approach compared to hand-crafted rules. The article concludes by outlining the mathematical representation of the RNN's hidden layers and the potential for improving detection accuracy through neural network training.
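As a concrete sketch of what a hand-written decision rule might look like, here is a minimal Python classifier. The rule itself (flagging messages with lines that end in characters common in code but rare in prose) and the list of endings are assumptions for illustration, not the article's actual Rule 1:

```python
# Hypothetical decision rule in the spirit of the article's hand-written
# classifiers: flag a message as containing code if any of its lines ends
# with a character that is common in code but rare in prose. This is an
# illustration only; the article's actual Rule 1 may differ.
CODE_LINE_ENDINGS = (";", "{", "}", ":")

def looks_like_code(message: str) -> bool:
    """Return True if any line of the message ends like a code line."""
    return any(line.rstrip().endswith(CODE_LINE_ENDINGS)
               for line in message.splitlines())

print(looks_like_code("for i in range(10):\n    print(i)"))  # True
print(looks_like_code("See you at the standup tomorrow."))   # False
```

Note that an innocuous message like "Here's the plan:" would be flagged as code by this rule, which illustrates the false-positive problem the article describes with manual rule creation.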

Related

We need an evolved robots.txt and regulations to enforce it

In the era of AI, the robots.txt file faces limitations in guiding web crawlers. Proposals advocate for enhanced standards to regulate content indexing, caching, and language model training. Stricter enforcement, including penalties for violators like Perplexity AI, is urged to protect content creators and uphold ethical AI practices.

Francois Chollet – LLMs won't lead to AGI – $1M Prize to find solution [video]

The video discusses limitations of large language models in AI, emphasizing genuine understanding and problem-solving skills. A prize incentivizes AI systems showcasing these abilities. Adaptability and knowledge acquisition are highlighted as crucial for true intelligence.

'Skeleton Key' attack unlocks the worst of AI, says Microsoft

Microsoft warns of "Skeleton Key" attack exploiting AI models to generate harmful content. Mark Russinovich stresses the need for model-makers to address vulnerabilities. Advanced attacks like BEAST pose significant risks. Microsoft introduces AI security tools.

Programmers Should Never Trust Anyone, Not Even Themselves

Programmers are warned to stay cautious and skeptical in software development. Abstractions simplify but can fail, requiring verification and testing to mitigate risks and improve coding reliability and skills.

Analysing 16,625 papers to figure out where AI is headed next (2019)

MIT Technology Review analyzed 16,625 AI papers, noting deep learning's potential decline. Trends include shifts to machine learning, neural networks' rise, and reinforcement learning growth. AI techniques cycle, with future dominance uncertain.

25 comments
By @skybrian - 4 months
This article doesn't talk much about testing or getting training data. It seems like that part is key.

For code that you think you understand, it's because you've informally proven to yourself that it has some properties that generalize to all inputs. For example, a sort algorithm will sort any list, not just the ones you tested.

The thing we're uncertain about for a neural network is that we don't know how it will generalize; there are no properties that we think are guaranteed for unseen input, even if it's slightly different input. It might be because we have an ill-specified problem and we don't know how to mathematically specify what properties we want.

If you can actually specify a property well enough to write a property-based test (like QuickCheck) then you can generate large amounts of tests / training data through randomization. Start with one example of what you want, then write tests that generate every possible version of both positive and negative examples.

It's not a proof, but it's a start. At least you know what you would prove, if you could.
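The randomized-generation idea the comment describes can be sketched with just the standard library (the `hypothesis` package is the usual QuickCheck analogue in Python). The toy property here, that code lines end in a semicolon while prose ends in a period, is an assumption made up for this illustration:

```python
import random
import string

# Sketch of generating labelled training data by randomization, once a
# property has been pinned down. The "property" here is a made-up toy:
# code-like examples end in ';', prose-like examples end in '.'.

def random_identifier(rng: random.Random) -> str:
    return "".join(rng.choices(string.ascii_lowercase, k=rng.randint(3, 8)))

def make_code_example(rng: random.Random) -> str:
    # Positive example: a statement-like line ending in a semicolon.
    return f"{random_identifier(rng)} = {rng.randint(0, 99)};"

def make_prose_example(rng: random.Random) -> str:
    # Negative example: a few words ending in a period.
    return " ".join(random_identifier(rng)
                    for _ in range(rng.randint(3, 8))) + "."

def generate_dataset(n: int, seed: int = 0):
    """Return a shuffled list of (text, is_code) pairs, n of each class."""
    rng = random.Random(seed)
    data = [(make_code_example(rng), True) for _ in range(n)]
    data += [(make_prose_example(rng), False) for _ in range(n)]
    rng.shuffle(data)
    return data

dataset = generate_dataset(100)
print(len(dataset), dataset[0])
```

A dataset generated this way can serve both as test cases for a hand-written rule and as training data for a learned classifier, which is the symmetry the comment is pointing at.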

If you have such a thing, relying on spaghetti code or a neural network seems kind of similar? If you want another property to hold, you can write another property-based test for it. I suppose with the neural network you can train it instead of doing the edits yourself, but then again we have AI assistance for code fixes.

I think I'd still trust code more. At least you can debug it.

By @scotchmi_st - 4 months
This is an interesting article if you read it like a howto for constructing a neural network for performing a practical task. But if you take it at face-value, and follow a similar method the next time you need to parse some input, then, well, I don't know what to say really.

The author takes a hard problem (parsing arbitrary input for loosely-defined patterns), and correctly argues that this is likely to produce hard-to-read 'spaghetti' code.

They then suggest replacing that with code that is so hard to read that there is still active research into how it works (i.e. a neural net).

Don't over-index on something that's inscrutable versus something that you can understand but is 'ugly'. Sometimes, _maybe_, an ML model is what you want for a task. But a lot of the time, something that you can read and see why it's doing what it's doing, even if that takes some effort, is better than something that's impossible to understand.

By @pakl - 4 months
There exists the Universal (Function) Approximation Theorem for neural networks, which states that they can represent/encode any continuous function (on a compact domain) to a desired level of accuracy[0].

However there does not exist a theorem stating that those approximations can be learned (or how).

[0] https://en.m.wikipedia.org/wiki/Universal_approximation_theo...
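The existence-versus-learnability gap the comment describes can be seen numerically. Below is a small sketch (numpy, tanh units, and the target function are all choices made here, not something from the comment): a single hidden layer trained by plain gradient descent happens to drive the error down on an easy target, but the theorem only guarantees a good approximation exists, not that this procedure finds it.

```python
import numpy as np

# One hidden layer of tanh units fit to f(x) = x^2 - 0.5 on [-1, 1]
# by full-batch gradient descent on mean squared error.
rng = np.random.default_rng(0)
H = 16                                  # hidden units
x = np.linspace(-1.0, 1.0, 64).reshape(-1, 1)
t = x ** 2 - 0.5                        # target function values

W1 = rng.normal(0.0, 1.0, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.1, (H, 1)); b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

_, y0 = forward(x)
initial_loss = float(np.mean((y0 - t) ** 2))

lr = 0.1
for _ in range(5000):
    h, y = forward(x)
    dy = 2.0 * (y - t) / len(x)         # dLoss/dy for mean squared error
    dW2 = h.T @ dy; db2 = dy.sum(0)
    dh = (dy @ W2.T) * (1.0 - h ** 2)   # back through tanh
    dW1 = x.T @ dh; db1 = dh.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

_, y1 = forward(x)
final_loss = float(np.mean((y1 - t) ** 2))
print(f"loss: {initial_loss:.4f} -> {final_loss:.6f}")
```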

By @ryjo - 4 months
Really awesome. Thanks for this thorough write-up. I don't totally understand the deeper math concepts mentioned in this article around RNNs, but it's sparked some of my own thoughts. It feels similar to things I've been exploring lately-- that is: building your app interwoven with forward chaining algorithms. In your case, you're using RNNs, and in mine, I'm building into the Rete algorithm.

You also touch on something in this article that I've found quite powerful: putting things in terms of digesting an input string character-by-character. Then, we offload all of the reasoning logic to our algorithm. We write very thin i/o logic, and then the algorithm does the rest.

By @Fripplebubby - 4 months
Love this post! Gets into the details of what it _really_ means to take some function and turn it into an RNN, and comparing that to the "batteries included" RNNs included in PyTorch, as a learning experience.

Question:

> To model the state, we need to add three hidden layers to the network.

How did you determine that it would be three hidden layers? Is it a consequence of the particular rule you were implementing, or is that generally how many layers you would use to implement a rule of this shape (using your architecture rather than Elman's - could we use fewer layers with Elman's?)?

By @dekhn - 4 months
Are RNNs completely subsumed by transformers? IE, can I forget about learning anything about how to work with RNNs, and instead focus on transformers?
By @jlturner - 4 months
If this interests you, it's worth taking a look at Genetic Programming. I find it to be a simpler approach to the same problem, no math required. It simply recombines programs by their AST, and given some heuristic, optimizes the program for it. The magic is in your heuristic function, where you can choose what you want to optimize for (e.g. speed, program length, minimizing complex constructs or function calls, network efficiency, some combination thereof, etc.).

https://youtu.be/tTMpKrKkYXo
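A stripped-down sketch of the idea: the code below evolves expression trees toward a target function with mutation-only search (full genetic programming also recombines two parents by swapping subtrees). The operator set, the target function, and the mean-squared-error heuristic are all choices made for this demo:

```python
import random

# Expressions are nested tuples like ("add", ("mul", "x", "x"), "x"),
# with leaves "x" or numeric constants.
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
LEAVES = ["x", 1.0, 2.0]

def random_expr(rng, depth=3):
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(LEAVES)
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, tuple):
        op, a, b = expr
        return OPS[op](evaluate(a, x), evaluate(b, x))
    return expr  # numeric constant

def mutate(rng, expr, p=0.2):
    # Replace a random subtree with a fresh one.
    if rng.random() < p or not isinstance(expr, tuple):
        return random_expr(rng, depth=2)
    op, a, b = expr
    return (op, mutate(rng, a, p), mutate(rng, b, p))

def fitness(expr, target, xs):
    # Mean squared error; lower is better. nan is treated as worst.
    total = 0.0
    for x in xs:
        d = evaluate(expr, x) - target(x)
        total += d * d
    err = total / len(xs)
    return err if err == err else float("inf")

def evolve(target, generations=30, pop_size=100, seed=0):
    rng = random.Random(seed)
    xs = [i / 4 for i in range(-8, 9)]
    pop = [random_expr(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda e: fitness(e, target, xs))
        survivors = pop[: pop_size // 5]           # truncation selection
        pop = survivors + [mutate(rng, rng.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=lambda e: fitness(e, target, xs))

best = evolve(lambda x: x * x + x)
print(best)
```

Because the survivors are carried over unchanged each generation (elitism), the best fitness never gets worse, so the search at least matches the best purely random guess it started from.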

By @jpe90 - 4 months
I recently wrote a blog post exploring the idea of interfacing with local LLMs for ambiguous tasks like this. Doesn't that make more sense than coding the neural network yourself? Using something like llama.cpp and evaluating whether a small model solves your problem out of the box, and fine-tuning if not, then programmatically interfacing with llama.cpp via a wrapper of your choice seems more pragmatic to me.
By @alsxnt - 4 months
Recurrent neural networks can be used for arbitrary computations, the equivalence to Turing machines has been proven. However, they are utterly impractical for the task.

This seems to be a state machine that is somehow learned. The article could benefit from a longer synopsis and "Python" does not appear to be relevant at all. Learning real Python semantics would prove quite difficult due to the nature of the language (no standard, just do as CPython does).

By @awwaiid - 4 months
OK so first compile python to a NN. But next let's twist or overlay that onto a Transformer-based NN. Then we can have a Transformer Virtual Machine (TVM) execute arbitrary programs.

Use some of that transfer-learning (adding weights on top of each other) and an LLM can be "born" with an algorithm deeply encoded.

By @fnord77 - 4 months
> To model the state, we need to add three hidden layers to the network

Why 3?

And why use "h" for layer names?

By @lowyek - 4 months
I would love to see some work on duality i.e. code to neural network back and forth. Reason being - I can't debug a neural network but if it can be linearized into a if-else case with help of the token information => I can validate what it's doing -> fix it and then move it back to it's compressed neural representation.

Just another thought experiment -> sometimes I imagine neural networks as a zip of the training data where compression algorithm is backpropagation. Just like we have programs which let us see what files inside the zip are -> I imagine there can be programs which will let us select certain inference path of the neural net and then see what data affected that => then we edit that data to fix our issues or add more data there => and we have live neural network debugging and reprogramming in the same way we edit compressed zips

By @danans - 4 months
I'd like to see a cost vs precision/recall comparison of using a RNN vs an LLM (local or API) for a problem like this.
By @suzukigsx1100g - 4 months
That’s pretty lightwork for a snake in general. Send me a direct message if you come up with something better.
By @FeepingCreature - 4 months
Fwiw, I know LGTM as "let's get this moving" on pull requests. Seems to be contended.
By @sdwr - 4 months
This is new to me, and therefore bad and scary.

It's great that you know NN well enough to fold it into regular work. But think of all us poor regular developers! Who now have to grapple with:

- an unfamiliar architecture

- uncertainty / effectively non-deterministic results in program flow

By @29athrowaway - 4 months
Next time do one with Bayesian networks or another Probabilistic graphical model.
By @dinobones - 4 months
This article was going decently and then it just falls off a cliff.

The article basically says:

1) Here's this complex problem
2) Here's some hand-written heuristics
3) Here's a shitty neural net
4) Here's another neural net with some guy's last name from the PyTorch library
5) Here are the constraints with adopting neural nets

You can see why this is so unsatisfying, the leaps in logic become more and more generous.

What I would have loved to see is a comparison of a spaghetti code implementation vs a neural net implementation on a large dataset/codebase, then examples in the validation set that the neural net generalizes to but the heuristic fails at (or vice versa), and so on.

This would demonstrate the value of neural nets, if for example, there’s a novel example that the neural net finds that the spaghetti heuristic can’t.

Show tangible results, show some comparison, show something; even rough numbers on the performance of each in aggregate would be really useful.

By @thih9 - 4 months
> Of course, we should try and avoid writing spaghetti code if we can. But there are problems that are so ill-specified that any serious attempt to solve them results in just that.

Can you elaborate or do you have an example?

Based on just the above, I disagree - I'd say it's the job of the programmer to make sure that the problem is well-specified and that they can write maintainable code.

By @godelski - 4 months

  > Humans are bad at managing spaghetti code. Of course, we should try and avoid writing spaghetti code if we can. But there are problems that are so ill-specified that any serious attempt to solve them results in just that.
Sounds like a skill issue.

But seriously, how many programmers do you know that reach for the documents or help pages (man pages?) instead of just looking for the first SO post with a similar question? That's how you start programming because you're just trying to figure out how to do anything in the first place, but not where you should be years later. If you've been programming in a language for years you should have read a good portion of the docs in that time (in addition to SO posts), blogs, and so much more. Because the things change too, so you have to be keeping up, and the truth is that this will never happen if you just read SO posts to answer your one question (and the next, and the next) because it will always lag behind what tools exist and even more likely will significantly lag because more recent posts have less time to gain upvotes.

It kinda reminds me of the meme "how to exit vim." And how people state that it is so hard to learn. Not only does just typing `vim` into the terminal literally tell you how to quit, but there's a built in `vimtutor` that'll tell you how to use it and doesn't take very long to use. I've seen people go through this and be better than people that have "used" vim for years. And even then, how many people write `:help someFunction` into vim itself? Because it is FAR better than googling your question and you'll actually end up learning how the whole thing fits together because it is giving you context. The same is true for literally any programming language.

You should also be writing docs to your code because if you have spaghetti code, there's a puzzle you haven't solved yet. And guess what, documenting is not too different from the rubber ducky method. Here's the procedure: write code to make shit work, write docs and edit your code as you realize you can make things better, go on and repeat but not revisit functions as you fuck them up with another function. It's not nearly as much work as it sounds and the investments compound. But quality takes time and nothing worth doing is easy. It takes time to learn any habit and skill. If you always look for the quickest solution to "just get it done" and you never come back, then you probably haven't learned anything, you've just parroted someone else. Moving fast and breaking things is great, but once you have done that you got to clean up your mess. You don't clean your kitchen by breaking your dining room table. And your house isn't clean if all your dishes are on the table! You might have to temporarily move stuff around, but eventually you need to clean shit up. And code is exactly the same way. If you regularly clean your house, it stays clean and is easy to keep clean. But if you do it once a year it is a herculean effort that you'll dread.

By @lawlessone - 4 months
Edit: ok i see it detects code.

I thought it was replacing bits of ANN with custom python functions.

By @ultra_nick - 4 months
I feel like neural networks are increasingly going to look like code.

The next big innovation will be whoever figures out how to convert MoE-style models into something like function calls.