Open Source
I build stuff because it makes me better and it's really fun.
I tend to build things that are useful for me, and hopefully, they will be useful for others.
If you find something interesting or want to build something together, email me.
ML paper reconstructions
I reconstruct papers and popular models from scratch in either PyTorch or JAX.
MlxCLI
CLI functions to run LLMs from the terminal.
Run large ML models optimized for Apple Silicon directly from your terminal.
Links:
GitHub
Scrape Wikipedia
Concurrently scrape Wikipedia and tokenize the outputs.
An open-source Wikipedia scraper written in Go (for concurrency) that tokenizes the scraped output with OpenAI's tiktoken tokenizer.
Links:
GitHub
Augment ML
Multi-modal labeling and RLHF tagging.
Open-source infrastructure for labeling multimodal data, tagging it for RLHF, and augmenting your existing training data at no cost.
Links:
Website
Tinygrad Docs
Wrote docs and examples for Tinygrad.
I wrote and tested an example of every function in the Tensor and NN libraries for Tinygrad. I also wrote all of the docs and examples on how to use the library on MNIST and more. Tinygrad is an ML framework focused on making it really easy to build a model. It also provides optimizations for inference, but those come with some tradeoffs.
Little book of DL
Summarizing all of high-level Deep Learning.
I wrote a summary of all of the high-level concepts of deep learning. I think this is really important because it gives a fundamental, first-principles understanding of everything that is going on in Deep Learning.
Links:
GitHub
NPM Library AptosJS
NodeJS library to interact with the Aptos Blockchain.
I created a NodeJS library that provides React webhooks to interact with the Aptos blockchain. I created this because I want to remove all complexity when interacting with blockchains. I believe the only reason they are not widely adopted is the barrier to entry that complexity creates.
CambrianML
Web app to interact with arXiv better.
Cambrian is a web application that allows you to get the most out of any arXiv paper. It allows you to chat with papers, search for papers, send them to friends, and share your papers publicly.
Links:
Website
Fine-tuning LLaMA with LoRA
I fine-tuned LLaMA with LoRA on a sentiment task.
I fine-tuned LLaMA on a Hugging Face Twitter sentiment dataset. I used LoRA to keep training efficient, since it trains a small set of added low-rank weights instead of the full model.
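The LoRA idea behind this project can be illustrated with a minimal NumPy sketch. This is not the project's training code: the layer sizes, `rank`, `alpha`, and the helper `lora_forward` are all illustrative assumptions, and the frozen weight `W` is a random stand-in for a pretrained matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; none of these values come from the project.
d_in, d_out, rank, alpha = 64, 64, 4, 8

# Frozen "pretrained" weight (random stand-in here).
W = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially behaves exactly like the frozen layer.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))


def lora_forward(x):
    """Compute x @ (W + (alpha / rank) * B @ A).T without forming the full update."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T


x = rng.normal(size=(2, d_in))
out = lora_forward(x)

full_params = W.size            # weights a full fine-tune would update
lora_params = A.size + B.size   # weights LoRA actually trains
```

In this toy layer, only `A` and `B` would receive gradient updates, which is where the efficiency comes from: `lora_params` is a small fraction of `full_params`.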
Links:
GitHub
Deep NN in NumPy
I train a deep neural network in NumPy.
Training a neural network in PyTorch is easy because all of the primitives are given to you. But what if you had to implement it from scratch? I did just that and implemented a deep neural network in NumPy.
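To give a flavor of what "from scratch" means, here is a minimal sketch of a forward pass, a hand-written backward pass, and a gradient descent update in NumPy. It is an assumption-laden toy, not the project's code: it uses a single hidden layer for brevity, the XOR dataset, and arbitrarily chosen hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: a tiny dataset that a purely linear model cannot fit.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

hidden = 8  # illustrative hidden width
W1 = rng.normal(scale=0.5, size=(2, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.5, size=(hidden, 1))
b2 = np.zeros(1)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


lr = 0.1
losses = []
for step in range(500):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    losses.append(loss)

    # Backward pass: the chain rule, written out by hand.
    dz2 = (p - y) / len(X)       # gradient of mean cross-entropy w.r.t. output logits
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T
    dz1 = dh * (1 - h ** 2)      # derivative of tanh
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)

    # Plain gradient descent update.
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2
```

Everything an autograd framework would normally handle (the `dW1`, `dW2` lines in particular) has to be derived and coded manually, which is the point of the exercise.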
Links:
GitHub
Essay Embedding Search
Search through Paul Graham's essays using embeddings.
I created a Python script that allows you to search for any phrase in Paul Graham's essays. It uses embeddings to find the essay most similar to your query. I use ChromaDB as the embedding database.
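The core of embedding search is nearest-neighbor lookup by similarity, which can be sketched with the standard library alone. This is only an illustration under loud assumptions: the word-count `embed` function stands in for a learned embedding model, the in-memory `index` dict stands in for ChromaDB, and the essay keys and texts are hypothetical placeholders.

```python
import math
from collections import Counter

# Hypothetical placeholder documents, not real essay text.
essays = {
    "essay_a": "startups grow fast when founders talk to users",
    "essay_b": "writing clearly helps you think clearly",
    "essay_c": "great hackers love programming languages and tools",
}


def embed(text):
    """Toy embedding: a sparse word-count vector (stand-in for a learned model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Stand-in for the embedding database: embed each document once, up front.
index = {title: embed(text) for title, text in essays.items()}


def search(query, k=1):
    """Return the k titles whose embeddings are most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda title: cosine(q, index[title]), reverse=True)
    return ranked[:k]


best = search("how do startups grow")
```

A real embedding model would also match queries that share meaning but no words with an essay, which this word-count stand-in cannot do.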
Links:
GitHub
Collaborative Text Editor
Work with other people in a single text editor.
I created a collaborative text editor using NextJS and SocketIO. It allows you to work with other people in a single text editor. I built this to further my understanding of sockets in a real-time application, and I worked on many optimizations as well.
Links:
GitHub
Cybersyn Data Visualization
I created a BI tool for data visualization.
I built this BI tool to show the power of the Snowflake Data Marketplace. It allows you to grab any dataset and manipulate it so that you can build your app on top of it. It is similar to the App Store, but for data. Also, Cybersyn is cool.
AI Blog Generator
Web application to generate SEO blog articles.
I created this because I wanted to see if an LLM can create SEO-optimized blog articles. The theory behind it is that someone starting a new startup can rank on Google for keywords with low keyword difficulty (KD) and get cheap traffic.
Reverse Video Playback
Render a video in reverse.
I created this web application to render a video in reverse. I used FFmpeg and NextJS to do this, and the rendering runs in serverless functions on Vercel.
Links:
GitHub
Notion to Blog
Notion as a CMS.
I wanted to see if I could use Notion as a CMS. This way, all changes to a potential blog for a web application would update on the site within minutes.
Links:
GitHub
Randomization Experiment Interface
Controlled trial randomization.
A script to perform randomization for experiments. It randomly assigns subjects to test groups and provides an interface for calculating regression p-values.
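The two steps described here, random assignment and a regression-based significance check, can be sketched as follows. This is a hedged illustration, not the script itself: the helper names (`assign_groups`, `ols_slope`, `permutation_p_value`) and the synthetic outcomes are hypothetical, and a permutation p-value is swapped in for the analytic regression p-value so the example needs only the standard library.

```python
import random
from statistics import mean


def assign_groups(units, groups=("control", "treatment"), seed=0):
    """Shuffle units, then deal them into groups round-robin for balanced sizes."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: groups[i % len(groups)] for i, unit in enumerate(shuffled)}


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of label shuffles with a slope at least as extreme."""
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    x = list(x)  # copy so the caller's list is not shuffled
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(x)
        if abs(ols_slope(x, y)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic example: 20 units, a treatment effect of 2.0 plus small noise.
units = range(20)
assignment = assign_groups(units)
x = [1.0 if assignment[u] == "treatment" else 0.0 for u in units]
y = [2.0 * xi + 0.1 * (u % 5) for u, xi in zip(units, x)]
p_value = permutation_p_value(x, y)
```

With a binary treatment indicator, the regression slope equals the difference in group means, so the permutation check here tests the same quantity a regression coefficient would.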
Links:
GitHub