- machine-learning-zoomcamp
- zoomcamp-analytics
- data-engineering-zoomcamp
- mlops-zoomcamp
- MLOps project with Loan Data
- Large Language Models Course
- Google-Machine-Learning-for-Solutions-Architects
- A collection of design patterns/idioms in Python
- How to add tests to your data pipelines
- Data Pipeline Design Patterns - #1. Data flow patterns
- Data Pipeline Design Patterns - #2. Coding patterns in Python
- energy-forecasting
- How to build and automate a python ETL pipeline with airflow on AWS EC2
Good data-related papers - Link
Neatly organized here: Link
Good Data Engineering Papers
Updated list of engineering papers worth reading.
1. Google File System - https://lnkd.in/d2-wnyqZ
2. MapReduce - Big Data Processing Model - https://lnkd.in/dvE8-s8M
3. Bigtable - NoSQL Wide-Column Store - https://lnkd.in/drmvvSAK
4. Colossus Next Gen File Store - https://lnkd.in/dERKhwMf
5. Megastore - Highly Available Storage for Interactive Services - https://lnkd.in/d5JDs2-K
6. Monarch Time Series DB - https://lnkd.in/d3kH_NCp
7. Chubby Distributed Lock Management - https://lnkd.in/dYy-w5rW
8. Spanner Distributed Database - https://lnkd.in/d6Emnycp
9. Spanner - CAP theorem considerations - https://lnkd.in/dq29BAWQ
10. Dapper Tracing System - https://lnkd.in/dm36-6jn
11. Borg Cluster Management - https://lnkd.in/dnveV-HU
12. Zanzibar Authorization System - https://lnkd.in/d5Vf7sRD
13. Pregel Graph Processing - https://lnkd.in/daq4576Y
14. Napa - Data Warehousing - https://lnkd.in/dbEfsa5B
15. Napa - Partitioning Algorithm - https://lnkd.in/dkhA7efJ
16. TensorFlow - Machine Learning at Scale - https://lnkd.in/d-4NfV2Z
17. Google F1 - Fast Analytics - https://lnkd.in/dbZqEKuf
18. HALP - YouTube Content Delivery Network - https://lnkd.in/dHzJtUc7
19. Mesa - Data Warehousing - https://lnkd.in/dFJ_Jrz6
20. Google Firestore - https://lnkd.in/drtEN9qR
21. Amazon Aurora DB Architecture - https://lnkd.in/dcevpwFt
22. Dynamo DB NoSQL Database - https://lnkd.in/dMD8C_WK
23. Apple FoundationDB - Distributed Transactional Key-Value Store - https://lnkd.in/dG75i_9K
24. TikTok Monolith - Embedding in real-time - https://lnkd.in/dcjBXCnc
25. Scalability at what COST - https://lnkd.in/dJ9ScYKq
26. Gorilla - Time Series DB - https://lnkd.in/d3AeN2kB
27. Cassandra - NoSQL DB - https://lnkd.in/d-_nhtED
28. FlexiRaft - Distributed Consensus Tradeoffs - https://lnkd.in/dX3nMvmt
29. Memcache - In-memory Cache at Facebook - https://lnkd.in/dKeYK67g
30. Millisampler Network Sampling - https://lnkd.in/dsj9FuD6
31. TAO Graph Database - https://lnkd.in/daasJpYf
32. MineSweeper - Root Cause Analysis - https://lnkd.in/dEsd6iwj
33. Facebook Prophet - Forecasting at Scale - https://lnkd.in/daCmAjak
34. Facebook ShardManager - https://lnkd.in/dDy9Dp2h
35. Hive - Data Warehousing on MapReduce - https://lnkd.in/dpV8BM2R
36. Apache Thrift - Interface Definition Language - https://lnkd.in/d7NzhP54
37. Meta Twine - Cluster Management System - https://lnkd.in/d5t7VFKE
38. Meta ServiceRouter - Service mesh - https://lnkd.in/dVnkv_bV
39. Apache Hadoop - Distributed File System - https://lnkd.in/dHsQu9FN
40. Apache Kafka - Event Bus - https://lnkd.in/dyxuKbMb
41. Apache Flink - https://lnkd.in/dn_gMvaR
NLP
Vision Notes
It takes time to read, learn, experiment, and build expertise. Learning never ends!
Keep exploring!