An empirical exploration of why arithmetic models sometimes neither memorize nor generalize. (Part 2 of an ongoing exploration into how small language models learn arithmetic.) In the previous post, I described an experiment that started with a simple goal: to observe memorization and eventual grokking in a small language model trained on arithmetic operations. The setup… Continue reading “Between Memorization and Meaning: When Neural Networks Learn, But Not the Way We Expect”
Author Archives: Sam Banerjee
Why arithmetic models look dumb long after they’ve learned the rule
An experiment in memorization, grokking, and misleading loss curves. This post documents an experiment that didn’t go the way I expected. What started as a simple attempt to observe memorization and grokking in arithmetic models turned into a deeper lesson about how misleading loss curves can be, especially for algorithmic tasks. What I expected to… Continue reading “Why arithmetic models look dumb long after they’ve learned the rule”
Digital payments and cashless society
We, the social beings of modern society, tend to take things for granted. Like most things, we have taken society’s technological advancements for granted, without spending enough time understanding the long-term consequences of these advancements or the philosophies underpinning their evolution. Fernando Montoro, a friend, philosopher and guide,… Continue reading “Digital payments and cashless society”
Online dating – flame of lust or love
If you were born before the ’90s, chances are that you met your partner in high school, at a coffee shop, through a friend or on a trip. But gone are the days when you had to gather enough courage to present yourself in front of your crush, when you had to meet them in a… Continue reading “Online dating – flame of lust or love”
Trust – a social fabric
Preface – Smart contracts and decentralization. One of the fundamental ideologies of a blockchain ecosystem is “trust”, or rather “trustless”ness. With smart contracts, a block of code would be able to run automatically, setting the terms of a “contract” as per protocol, and would be immutably persisted in the blockchain. In this post, I am going… Continue reading “Trust – a social fabric”
Get paid for paying attention
How many times have you been irritated by spam emails and random ad campaigns in your inbox when you open your email? Quite often, right? Or think of a time when you were absorbed in a piece of content on a website and got interrupted by an ad banner selling you things that you… Continue reading “Get paid for paying attention”
Barter to Bitcoin – an evolution story
What is money? At first glance, it’s a piece of paper that can buy you anything from a cup of coffee to a house. But why is money “money”? In other words, why does money have to be a unit of transaction? Why does it have a value? Seriously, what’s stopping us from printing shit ton… Continue reading “Barter to Bitcoin – an evolution story”
The Reactive Future
Human nature has been reactive since the beginning of time. If there is a need to travel farther without too much inconvenience, we invent machines (read: steam engines, cars, trains and the like). When there isn’t enough land for new construction, we cut down forests. When we needed certain… Continue reading “The Reactive Future”
Wisdom in a box
Welcome to the future. Think. Apply. Transform. © 2021 Wisdombox. Wisdombox, a.k.a. wsdmbox, is a platform to voice opinions and encourage thinking by creating meaningful content, asking questions and sharing experiences, so as to raise our collective wisdom in a way that would create awareness, have a long-term positive impact on society and encourage us to take necessary action towards… Continue reading “Wisdom in a box”
Extract-Transform-Load fast with Golang and Postgres
The Pitch: So, I joined a travel startup recently, and since my first day here I have been focusing on organizing the humongous data from different sources that fuels the business, using data pipelines and sharding strategies. The Problem: Part of the problem was to solve this huge data import problem. All we needed was… Continue reading “Extract-Transform-Load fast with Golang and Postgres”