  • ADCB - Egypt
  • Cairo, Egypt
  • LinkedIn in/ali-tarek-

alitarek-dot/README.md

Hi 👋, I'm Ali Tarek

Data & Analytics Engineer | Scalable Pipelines | Kafka • Spark • Cloud | Active Freelancer

Connect with me:

https://www.linkedin.com/in/ali-tarek-/ https://www.kaggle.com/adude123 https://leetcode.com/Ali_Tarek/

I build data infrastructure that doesn't fall apart when your traffic spikes.

While many focus solely on writing SQL, I approach data from a System Design perspective. My goal is to bridge the gap between messy raw data and the high-performance infrastructure required to process it at scale.

Core Strengths

High-Throughput Streaming: Implementing Apache Kafka and Flink for real-time data needs.

Distributed Processing: Optimizing Apache Spark (PySpark/Scala) to slash compute costs and execution windows.

Infrastructure: Skilled in managing On-Prem VMs and scaling Cloud Platforms (AWS/GCP).

Engineering Rigor: I prioritize CI/CD, data validation, and building modular, self-healing pipelines over "quick fixes."

Key Projects & Accomplishments

Real-Time Analytics Engine: Architected a pipeline using Kafka and Spark Streaming that reduced data latency from hours to under 10 seconds.

Cloud Migration: Migrated a legacy on-premises data environment to a hybrid cloud setup, improving uptime by 30%.

Pipeline Optimization: Refactored a bloated ETL process in a junior role, reducing Snowflake/BigQuery credit consumption by 25% through better partitioning and logic.

Education & Continuous Learning

B.S. in Computer Science / Information Technology.

Deeply committed to the Modern Data Stack, and constantly testing new tools in orchestration (Dagster/Airflow) and storage (Iceberg/Delta Lake).

I’m brutally honest about what works and what doesn't. If a project doesn't need a complex Kafka setup, I’ll tell you. I’m here to build the right system, not the most expensive one.

Have a bottleneck in your data flow? Send me a message and let’s solve it.

Graduation Project

Popular repositories

  1. alitarek-dot (Public)

    1

  2. Data-Science-For-Beginners (Public)

    Forked from microsoft/Data-Science-For-Beginners

    10 Weeks, 20 Lessons, Data Science for All!

    Jupyter Notebook

  3. ML-For-Beginners (Public)

    Forked from microsoft/ML-For-Beginners

    12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

    HTML

  4. AI-For-Beginners (Public)

    Forked from microsoft/AI-For-Beginners

    12 Weeks, 24 Lessons, AI for All!

    Jupyter Notebook

  5. Data-Science (Public)

    Jupyter Notebook

  6. resume_scorer_haystack (Public)

    Resume scorer and analyzer using Haystack framework

    Python