<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Villoro</title><description>Villoro&apos;s personal website</description><link>https://villoro.com/</link><language>en</language><item><title>How to Be Nice to the Data Team</title><link>https://villoro.com/blog/be-nice-to-data-team/</link><guid isPermaLink="true">https://villoro.com/blog/be-nice-to-data-team/</guid><description>A practical guide for product, engineering, ops, and business teams on how to design source systems and frame requests so the data team can deliver faster and more reliably.</description><pubDate>Tue, 05 May 2026 00:00:00 GMT</pubDate><category>Data Engineering</category><category>Best Practices</category><author>Admin</author></item><item><title>Marimo notebooks for Python projects</title><link>https://villoro.com/blog/marimo-notebooks/</link><guid isPermaLink="true">https://villoro.com/blog/marimo-notebooks/</guid><description>Why marimo is an interesting notebook alternative for Python projects.</description><pubDate>Wed, 15 Apr 2026 00:00:00 GMT</pubDate><category>Python</category><category>Tools</category><category>AWS</category><category>DE</category><author>Admin</author></item><item><title>Recovering files from S3 using Delete Markers</title><link>https://villoro.com/blog/recover-s3-files-delete-markers/</link><guid isPermaLink="true">https://villoro.com/blog/recover-s3-files-delete-markers/</guid><description>Learn how to recover accidentally deleted S3 files using versioning and delete markers — including listing deleted objects, filtering by time window, and restoring them with boto3 and pandas.</description><pubDate>Tue, 17 Mar 2026 00:00:00 GMT</pubDate><category>DE</category><category>AWS</category><category>S3</category><category>Python</category><category>Best Practices</category><category>Versioning</category><author>Admin</author></item><item><title>Protecting Production Tables in 
dbt</title><link>https://villoro.com/blog/protecting-prod-tables-dbt/</link><guid isPermaLink="true">https://villoro.com/blog/protecting-prod-tables-dbt/</guid><description>Learn three proven strategies to safeguard production tables in dbt: protecting against bad data, preventing unexpected changes, and securing sensitive data with seeds.</description><pubDate>Tue, 17 Feb 2026 00:00:00 GMT</pubDate><category>dbt</category><category>SQL</category><category>DE</category><author>Admin</author></item><item><title>Scalable GitHub Actions for Modern Repos</title><link>https://villoro.com/blog/github-hooks/</link><guid isPermaLink="true">https://villoro.com/blog/github-hooks/</guid><description>Learn how to build scalable GitHub Actions pipelines using reusable hooks, matrix jobs, gate checks, and concurrency for cleaner, faster, and more maintainable CI/CD workflows.</description><pubDate>Wed, 14 Jan 2026 00:00:00 GMT</pubDate><category>CI/CD</category><category>GitHub</category><category>Python</category><category>Best Practices</category><category>Monorepo</category><author>Admin</author></item><item><title>Scaling ECS Python Deployments with a Modular Monorepo</title><link>https://villoro.com/blog/monorepo-multimage/</link><guid isPermaLink="true">https://villoro.com/blog/monorepo-multimage/</guid><description>Learn how to evolve a Python project from a single workspace to a scalable, multi-package monorepo with isolated builds, targeted CI, and lightweight ECS deployments using uv.</description><pubDate>Tue, 09 Dec 2025 00:00:00 GMT</pubDate><category>AWS</category><category>ECS</category><category>Docker</category><category>uv</category><category>DE</category><category>Fargate</category><category>Monorepo</category><author>Admin</author></item><item><title>Fast and Reproducible Python Deployments on ECS with uv</title><link>https://villoro.com/blog/fast-python-ecs-with-uv/</link><guid 
isPermaLink="true">https://villoro.com/blog/fast-python-ecs-with-uv/</guid><description>Speed up ECS tasks with lightweight Docker images and versioned Python environments using uv. Learn how to build once, run anywhere with zero rebuilds and instant cold starts.</description><pubDate>Tue, 25 Nov 2025 00:00:00 GMT</pubDate><category>AWS</category><category>ECS</category><category>Docker</category><category>uv</category><category>DE</category><category>Fargate</category><author>Admin</author></item><item><title>Querying APIs Best Practices</title><link>https://villoro.com/blog/querying-apis-best-practices/</link><guid isPermaLink="true">https://villoro.com/blog/querying-apis-best-practices/</guid><description>Learn proven best practices for building resilient Python API clients: structure, safety, error handling, and advanced patterns like streaming and incremental loads.</description><pubDate>Tue, 14 Oct 2025 00:00:00 GMT</pubDate><category>Python</category><category>API</category><category>DE</category><category>Best Practices</category><author>Admin</author></item><item><title>DuckDB and Beyond</title><link>https://villoro.com/blog/duckdb-fast-analytics/</link><guid isPermaLink="true">https://villoro.com/blog/duckdb-fast-analytics/</guid><description>DuckDB is the “SQLite for analytics”: fast, modern SQL, and capable of billions of rows on a laptop. Learn its SQL superpowers, explore practical use cases, and discover the promise of DuckLake.</description><pubDate>Mon, 15 Sep 2025 00:00:00 GMT</pubDate><category>Python</category><category>SQL</category><category>DE</category><author>Admin</author></item><item><title>Understanding VPNs - Privacy and Security</title><link>https://villoro.com/blog/vpn-tailscale/</link><guid isPermaLink="true">https://villoro.com/blog/vpn-tailscale/</guid><description>Explore the differences between commercial VPNs for privacy and security and private network VPNs for secure remote access. 
Learn how Tailscale simplifies private VPN setups and how exit nodes can help bypass restrictions.</description><pubDate>Fri, 08 Aug 2025 00:00:00 GMT</pubDate><category>Tools</category><category>Setup</category><category>Security</category><category>Networking</category><author>Admin</author></item><item><title>Handling pipeline failures</title><link>https://villoro.com/blog/handling-pipeline-failures/</link><guid isPermaLink="true">https://villoro.com/blog/handling-pipeline-failures/</guid><description>Learn how to make your data pipelines resilient by design using the ETU (Extract, Transform, Upload) approach. This guide covers failure handling strategies in dbt, Prefect, and more.</description><pubDate>Thu, 12 Jun 2025 00:00:00 GMT</pubDate><category>SQL</category><category>DE</category><category>Best Practices</category><category>dbt</category><category>Prefect</category><author>Admin</author></item><item><title>Time Series Smoothing Techniques in Python and SQL</title><link>https://villoro.com/blog/smoothing-time-series/</link><guid isPermaLink="true">https://villoro.com/blog/smoothing-time-series/</guid><description>A practical guide to smoothing time series data using Python and SQL. 
Explore moving averages, Gaussian and Lowess smoothers, SQL medians, quantiles, and their trade-offs through side-by-side visual comparisons.</description><pubDate>Thu, 08 May 2025 00:00:00 GMT</pubDate><category>DE</category><category>Python</category><category>SQL</category><category>Best Practices</category><author>Admin</author></item><item><title>Improving Athena performance by reducing Glue versions</title><link>https://villoro.com/blog/improving-aws-athena-performance-by-reducing-glue-versions/</link><guid isPermaLink="true">https://villoro.com/blog/improving-aws-athena-performance-by-reducing-glue-versions/</guid><description>Learn how AWS Glue schema version limits can silently break Athena queries and dbt runs—and how cleaning up old versions can improve performance by up to 4×.</description><pubDate>Thu, 10 Apr 2025 00:00:00 GMT</pubDate><category>DE</category><category>AWS</category><category>Best Practices</category><category>Performance</category><category>dbt</category><author>Admin</author></item><item><title>Fast Python Package Management and Linting with uv and ruff</title><link>https://villoro.com/blog/astral-tools-uv-and-ruff/</link><guid isPermaLink="true">https://villoro.com/blog/astral-tools-uv-and-ruff/</guid><description>Discover how Astral&apos;s Rust-powered tools, uv and ruff, provide a blazing-fast alternative for Python package management and linting. 
Learn how to set them up, integrate them into CI/CD workflows, and boost your development speed.</description><pubDate>Fri, 14 Mar 2025 00:00:00 GMT</pubDate><category>Python</category><category>Tools</category><category>Performance</category><category>Benchmark</category><category>Setup</category><category>Best Practices</category><author>Admin</author></item><item><title>Managing S3 Costs with Inventory and Lifecycle Policies</title><link>https://villoro.com/blog/manage-s3-costs/</link><guid isPermaLink="true">https://villoro.com/blog/manage-s3-costs/</guid><description>Discover strategies to optimize Amazon S3 costs by leveraging inventory exports and lifecycle policies for automated storage management.</description><pubDate>Fri, 28 Feb 2025 00:00:00 GMT</pubDate><category>AWS</category><category>S3</category><category>Best Practices</category><author>Admin</author></item><item><title>How to Set Up MEGASync on a TerraMaster NAS Using Docker</title><link>https://villoro.com/blog/megasync-terramaster-docker-setup/</link><guid isPermaLink="true">https://villoro.com/blog/megasync-terramaster-docker-setup/</guid><description>A step-by-step guide to installing and configuring MEGASync on a TerraMaster NAS using Docker for seamless MEGA cloud synchronization.</description><pubDate>Wed, 15 Jan 2025 00:00:00 GMT</pubDate><category>NAS</category><category>Hardware</category><category>Terramaster</category><category>Docker</category><author>Admin</author></item><item><title>Setting Up and Optimizing a Terramaster NAS</title><link>https://villoro.com/blog/setting-up-terramaster-nas/</link><guid isPermaLink="true">https://villoro.com/blog/setting-up-terramaster-nas/</guid><description>A comprehensive guide on setting up and configuring the Terramaster F2-424 NAS, including hardware selection, RAID setup, storage configuration, network optimization, and essential app installations.</description><pubDate>Sat, 21 Dec 2024 00:00:00 
GMT</pubDate><category>NAS</category><category>Hardware</category><category>Terramaster</category><author>Admin</author></item><item><title>How to Control AMD CPU Temperatures</title><link>https://villoro.com/blog/control-amd-cpu-temperatures/</link><guid isPermaLink="true">https://villoro.com/blog/control-amd-cpu-temperatures/</guid><description>Learn how to control your AMD CPU&apos;s temperatures by configuring TDP limits to apply automatically at startup using Task Scheduler in Windows.</description><pubDate>Fri, 29 Nov 2024 00:00:00 GMT</pubDate><category>Tutorial</category><category>Setup</category><category>Performance</category><category>Windows</category><author>Admin</author></item><item><title>Solving the problem with small files in the Data Lake</title><link>https://villoro.com/blog/solving-problem-small-files-data-lake/</link><guid isPermaLink="true">https://villoro.com/blog/solving-problem-small-files-data-lake/</guid><description>This post addresses the common problem of small files in data lakes, which can lead to significant performance degradation and increased costs. It provides an in-depth guide on understanding the issues caused by small files, determining optimal file sizes, and effectively managing file sizes using tools like Apache Spark, AWS Athena, Delta Lake, and Apache Iceberg. 
The post also covers strategies for tracking file sizes and partitioning tables to optimize data processing and storage efficiency.</description><pubDate>Wed, 23 Oct 2024 00:00:00 GMT</pubDate><category>DE</category><category>Performance</category><category>Spark</category><category>dbt</category><author>Admin</author></item><item><title>Concurrent Async Calls to OpenAI with Rate Limiting</title><link>https://villoro.com/blog/async-openai-calls-rate-limiter/</link><guid isPermaLink="true">https://villoro.com/blog/async-openai-calls-rate-limiter/</guid><description>Learn how to implement multiple concurrent async calls to the OpenAI API while managing quota limits effectively using a rate limiter in Python. This guide covers best practices for API management in data engineering.</description><pubDate>Wed, 25 Sep 2024 00:00:00 GMT</pubDate><category>API</category><category>Best Practices</category><category>DE</category><category>OpenAI</category><category>Python</category><author>Admin</author></item><item><title>Transcribing Audios with Whisper and Structuring Data with ChatGPT</title><link>https://villoro.com/blog/transcribe-audios-whisper-extract-structured-data-chatgpt/</link><guid isPermaLink="true">https://villoro.com/blog/transcribe-audios-whisper-extract-structured-data-chatgpt/</guid><description>Learn how to extract and analyze call data using OpenAI models, with a focus on structured outputs for clear and consistent results. This guide covers transcribing calls with Whisper, extracting insights with ChatGPT, handling invalid calls, and processing multiple files efficiently. 
Explore how to use Pydantic models and OpenAI&apos;s API for structured data extraction.</description><pubDate>Mon, 02 Sep 2024 00:00:00 GMT</pubDate><category>Python</category><category>DE</category><category>AI</category><category>Tutorial</category><category>OpenAI</category><author>Admin</author></item><item><title>How to Perform a Clean Windows Installation: A Step-by-Step Guide</title><link>https://villoro.com/blog/clean-windows-install/</link><guid isPermaLink="true">https://villoro.com/blog/clean-windows-install/</guid><description>Learn how to download, install, and set up a clean Windows installation without bloatware. This comprehensive guide covers creating a bootable USB, setting up a local account, and configuring essential settings for optimal performance.</description><pubDate>Wed, 14 Aug 2024 00:00:00 GMT</pubDate><category>Tutorial</category><category>Setup</category><category>Windows</category><author>Admin</author></item><item><title>Extracting data from Salesforce with Python</title><link>https://villoro.com/blog/extracting-data-salesforce-python/</link><guid isPermaLink="true">https://villoro.com/blog/extracting-data-salesforce-python/</guid><description>The post serves as a comprehensive guide to extracting data from Salesforce using Python, focusing on the Simple Salesforce library. It covers various querying options, strategies to avoid timeouts and memory errors, and insights into different types of fields within Salesforce. 
Additionally, it provides practical code examples and recommendations for effective data extraction.</description><pubDate>Tue, 16 Jul 2024 00:00:00 GMT</pubDate><category>Python</category><category>DE</category><category>Tutorial</category><author>Admin</author></item><item><title>dbt testing with DuckDB</title><link>https://villoro.com/blog/dbt-testing-duckdb/</link><guid isPermaLink="true">https://villoro.com/blog/dbt-testing-duckdb/</guid><description>Learn how to test dbt projects using DuckDB to ensure high-quality, consistent SQL code. This guide covers setting up a SQL linter with Sqlfluff, automating testing with pre-commit hooks, and creating a streamlined CI pipeline for efficient testing.</description><pubDate>Wed, 19 Jun 2024 00:00:00 GMT</pubDate><category>SQL</category><category>DE</category><category>Best Practices</category><category>dbt</category><author>Admin</author></item><item><title>Running dbt with AWS ECS (Fargate)</title><link>https://villoro.com/blog/running-dbt-with-aws-ecs-fargate/</link><guid isPermaLink="true">https://villoro.com/blog/running-dbt-with-aws-ecs-fargate/</guid><description>Guide to setting up and running dbt on AWS ECS (Fargate), covering Dockerization, package handling, Docker images, integration with Prefect and exporting results.</description><pubDate>Fri, 07 Jun 2024 00:00:00 GMT</pubDate><category>SQL</category><category>DE</category><category>AWS</category><category>Setup</category><category>Tutorial</category><category>dbt</category><author>Admin</author></item><item><title>Self-Healing Pipelines</title><link>https://villoro.com/blog/self-healing-pipelines/</link><guid isPermaLink="true">https://villoro.com/blog/self-healing-pipelines/</guid><description>This post explores self-healing pipelines, which automate backfilling missing data to reduce manual interventions. It discusses challenges in traditional batch processing and introduces techniques like partitioning, deduplication, and schema change handling. 
Detailed Python and SQL code snippets illustrate implementation, emphasizing the importance of proper setup in dbt.</description><pubDate>Thu, 23 May 2024 00:00:00 GMT</pubDate><category>DE</category><category>Best Practices</category><author>Admin</author></item><item><title>Effortless EMR: A Guide to Seamlessly Running PySpark Code</title><link>https://villoro.com/blog/effortless-emr-guide-running-pyspark/</link><guid isPermaLink="true">https://villoro.com/blog/effortless-emr-guide-running-pyspark/</guid><description>Tired of complex setups for PySpark on EMR? This guide offers a simpler approach.
From running scripts to handling parameters and using Docker, we cover it all.
With practical examples and clear explanations, simplify your PySpark workflow on EMR and make big data processing a breeze!
</description><pubDate>Tue, 07 May 2024 00:00:00 GMT</pubDate><category>AWS</category><category>Tutorial</category><author>Admin</author></item><item><title>Setting Up and Deploying Prefect Server: A Comprehensive Guide</title><link>https://villoro.com/blog/prefect-server-setup-configuration-deployment/</link><guid isPermaLink="true">https://villoro.com/blog/prefect-server-setup-configuration-deployment/</guid><description>Learn how to set up and deploy Prefect Server for orchestrating workflows seamlessly. Understand the steps involved, including server configuration, creating deployments, and connecting to the server</description><pubDate>Thu, 11 Apr 2024 00:00:00 GMT</pubDate><category>Tools</category><category>Orchestration</category><category>Tutorial</category><category>Setup</category><author>Admin</author></item><item><title>Prefect Essentials: Basics, Setup and Migration</title><link>https://villoro.com/blog/prefect-essentials-setup-and-migration/</link><guid isPermaLink="true">https://villoro.com/blog/prefect-essentials-setup-and-migration/</guid><description>Discover Prefect&apos;s simplicity and flexibility. Learn the basics, setup Prefect Cloud, and smoothly migrate from other orchestrators.</description><pubDate>Mon, 08 Apr 2024 00:00:00 GMT</pubDate><category>Tools</category><category>Orchestration</category><category>Intro</category><category>Setup</category><author>Admin</author></item><item><title>Clean repo by rewriting GIT history</title><link>https://villoro.com/blog/clean-repo-rewrite-git-history/</link><guid isPermaLink="true">https://villoro.com/blog/clean-repo-rewrite-git-history/</guid><description>Discover how to clean up your Git history effortlessly with our step-by-step guide. Learn to eliminate unnecessary commits, maintain versioning consistency, and automate the process with Python scripts. 
Streamline your repository and enhance collaboration today!</description><pubDate>Sun, 17 Mar 2024 00:00:00 GMT</pubDate><category>GIT</category><author>Admin</author></item><item><title>Managing package versions with Poetry</title><link>https://villoro.com/blog/managing-package-versions-with-poetry/</link><guid isPermaLink="true">https://villoro.com/blog/managing-package-versions-with-poetry/</guid><description>This post explains how to automate project version management using Poetry and poetry-bumpversion.
It follows semantic versioning and recommends tools for updating versions.
GitHub Actions are set up to automatically update the version, commit changes when necessary, and tag commits on the main branch.
</description><pubDate>Tue, 07 Nov 2023 00:00:00 GMT</pubDate><category>Python</category><category>Tools</category><category>Poetry</category><category>Versioning</category><author>Admin</author></item><item><title>Efficiently reading large volumes of data from redshift with pyarrow</title><link>https://villoro.com/blog/reading-from-redshift-with-pyarrow/</link><guid isPermaLink="true">https://villoro.com/blog/reading-from-redshift-with-pyarrow/</guid><description>This post discusses efficient ways to consume data from Amazon Redshift as pandas dataframes.
It outlines the challenges of scaling Redshift performance, especially with bots consuming data, and provides solutions for more efficient data access.
The recommended approach is unloading data to Parquet files, and the post explains various methods for reading Parquet files into pandas dataframes using pandas, pyarrow, and pyarrow.dataset.
The performance comparisons demonstrate the advantages of the new pyarrow.dataset introduced in pyarrow 3.0.0 for handling partitioned data efficiently.
</description><pubDate>Tue, 08 Jun 2021 00:00:00 GMT</pubDate><category>Python</category><category>DE</category><category>SQL</category><category>AWS</category><author>Admin</author></item><item><title>Using regexs with python</title><link>https://villoro.com/blog/regex-python/</link><guid isPermaLink="true">https://villoro.com/blog/regex-python/</guid><description>This post provides a detailed guide on regular expressions (regex), covering their basics, character classes, quantifiers, groups, boundaries, flags, and more.
It also explains how to use regex in Python, Pandas, and Amazon Redshift, offering practical examples and functions for working with text data.
This comprehensive resource helps users understand and apply regex in various programming contexts.
</description><pubDate>Mon, 05 Apr 2021 00:00:00 GMT</pubDate><category>Python</category><author>Admin</author></item><item><title>ING API with python</title><link>https://villoro.com/blog/ing-api-python/</link><guid isPermaLink="true">https://villoro.com/blog/ing-api-python/</guid><description>Learn how to interact with ING&apos;s Open Banking API using Python. This guide covers setting up certificates, making authenticated requests, and integrating with the ING API to perform basic operations.</description><pubDate>Wed, 10 Mar 2021 00:00:00 GMT</pubDate><category>Python</category><category>DE</category><category>API</category><author>Admin</author></item><item><title>Java UDF with pyspark</title><link>https://villoro.com/blog/java-udf-pyspark/</link><guid isPermaLink="true">https://villoro.com/blog/java-udf-pyspark/</guid><description>Learn how to create and use Java UDFs with PySpark for improved performance. This guide covers setting up Java functions, compiling them, and integrating them into your PySpark workflows, with a performance comparison to Python UDFs.</description><pubDate>Tue, 03 Nov 2020 00:00:00 GMT</pubDate><category>Python</category><category>Spark</category><category>Benchmark</category><author>Admin</author></item><item><title>Setting up a Python environment</title><link>https://villoro.com/blog/python-environment-configuration/</link><guid isPermaLink="true">https://villoro.com/blog/python-environment-configuration/</guid><description>Learn how to properly set up a python development environment with all the useful tools you can use as well as some good tips to work as a pro.</description><pubDate>Wed, 20 May 2020 00:00:00 GMT</pubDate><category>Python</category><category>Tools</category><author>Admin</author></item><item><title>Using Pre-commit to automate tasks</title><link>https://villoro.com/blog/pre-commit/</link><guid isPermaLink="true">https://villoro.com/blog/pre-commit/</guid><description>With pre-commit you can automate some 
repetitive tasks before doing a commit on a git repository.
It is useful for identifying simple issues before submission to code review.
</description><pubDate>Thu, 14 May 2020 00:00:00 GMT</pubDate><category>Tools</category><author>Admin</author></item><item><title>Poetry python package manager</title><link>https://villoro.com/blog/poetry-python-package-manager/</link><guid isPermaLink="true">https://villoro.com/blog/poetry-python-package-manager/</guid><description>Poetry is a python packaging and dependency manager.
It makes it easy to manage packages while using virtual environments under the hood.
It also allows building and publishing packages to PyPI (or other repositories).
</description><pubDate>Fri, 08 May 2020 00:00:00 GMT</pubDate><category>Python</category><category>Tools</category><author>Admin</author></item><item><title>Cmder terminal for windows</title><link>https://villoro.com/blog/cmder-terminal-windows/</link><guid isPermaLink="true">https://villoro.com/blog/cmder-terminal-windows/</guid><description>Cmder is a really good terminal for windows that can replace the default one.</description><pubDate>Sat, 29 Feb 2020 00:00:00 GMT</pubDate><category>Tools</category><author>Admin</author></item><item><title>Luigi orchestrator</title><link>https://villoro.com/blog/luigi-orchestrator/</link><guid isPermaLink="true">https://villoro.com/blog/luigi-orchestrator/</guid><description>Luigi is a Python package that helps you build complex pipelines of batch jobs.
It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more.
</description><pubDate>Mon, 06 Jan 2020 00:00:00 GMT</pubDate><category>Tools</category><category>Orchestration</category><author>Admin</author></item><item><title>PySpark example with parquet</title><link>https://villoro.com/blog/pyspark-example/</link><guid isPermaLink="true">https://villoro.com/blog/pyspark-example/</guid><description>This is an example of how to work with Spark dataframes, using a 22 GB CSV file.
At this size, you probably won&apos;t be able to process the data with pandas.
The post covers some best practices for working with data of this size.
</description><pubDate>Tue, 24 Dec 2019 00:00:00 GMT</pubDate><category>Python</category><category>Spark</category><author>Admin</author></item><item><title>Databricks intro</title><link>https://villoro.com/blog/databricks-intro/</link><guid isPermaLink="true">https://villoro.com/blog/databricks-intro/</guid><description>Databricks provides a Unified Analytics Platform that accelerates innovation by unifying data science, engineering and business.
Learn how to set up Databricks and what it can offer you.
</description><pubDate>Mon, 23 Dec 2019 00:00:00 GMT</pubDate><category>Tools</category><category>Spark</category><author>Admin</author></item><item><title>H2O AutoML</title><link>https://villoro.com/blog/h2o-automl/</link><guid isPermaLink="true">https://villoro.com/blog/h2o-automl/</guid><description>H2O is a fully open source, distributed in-memory machine learning platform with linear scalability.
In this post you will see how to use AutoML to let H2O train multiple machine learning algorithms automatically.
</description><pubDate>Sun, 22 Dec 2019 00:00:00 GMT</pubDate><category>Python</category><category>Spark</category><category>ML</category><category>Intro</category><author>Admin</author></item><item><title>Pyspark Intro</title><link>https://villoro.com/blog/pyspark-intro/</link><guid isPermaLink="true">https://villoro.com/blog/pyspark-intro/</guid><description>Apache Spark is a very fast unified analytics engine for big data and machine learning. This is a beginner tutorial of how to use spark to work with dataframes.</description><pubDate>Sat, 21 Dec 2019 00:00:00 GMT</pubDate><category>Python</category><category>Spark</category><category>Intro</category><author>Admin</author></item><item><title>Pandas Intro</title><link>https://villoro.com/blog/pandas-intro/</link><guid isPermaLink="true">https://villoro.com/blog/pandas-intro/</guid><description>Learn how to use python pandas library to work with tabular data.</description><pubDate>Fri, 01 Nov 2019 00:00:00 GMT</pubDate><category>Python</category><category>Pandas</category><category>Intro</category><author>Admin</author></item><item><title>Reading and writing files with python using Dropbox</title><link>https://villoro.com/blog/dropbox-python/</link><guid isPermaLink="true">https://villoro.com/blog/dropbox-python/</guid><description>Learn how to read, write and delete all kind of files using the dropbox library for python.</description><pubDate>Sat, 14 Sep 2019 00:00:00 GMT</pubDate><category>Python</category><category>Pandas</category><category>DE</category><author>Admin</author></item><item><title>Google Analytics ETL with python</title><link>https://villoro.com/blog/google-analytics-etl-python/</link><guid isPermaLink="true">https://villoro.com/blog/google-analytics-etl-python/</guid><description>Google Analytics is really useful but sometimes you want to retrieve all the data and perform your own analysis and create your reports. 
In this post you will see how to do that with python.</description><pubDate>Thu, 04 Jul 2019 00:00:00 GMT</pubDate><category>API</category><category>Python</category><category>Pandas</category><category>DE</category><category>Tutorial</category><author>Admin</author></item><item><title>Using git with SSH</title><link>https://villoro.com/blog/git-with-ssh/</link><guid isPermaLink="true">https://villoro.com/blog/git-with-ssh/</guid><description>Learn how to work securely with git by using SSH authentication with GitHub.</description><pubDate>Sat, 15 Jun 2019 00:00:00 GMT</pubDate><category>GIT</category><category>Security</category><category>Setup</category><author>Admin</author></item><item><title>Using gitflow for code control</title><link>https://villoro.com/blog/using-gitflow/</link><guid isPermaLink="true">https://villoro.com/blog/using-gitflow/</guid><description>Good development practices: how to use gitflow for code control.</description><pubDate>Tue, 04 Jun 2019 00:00:00 GMT</pubDate><category>GIT</category><category>Best Practices</category><author>Admin</author></item><item><title>Improving python performance with Numba</title><link>https://villoro.com/blog/improve-python-performance-with-numba/</link><guid isPermaLink="true">https://villoro.com/blog/improve-python-performance-with-numba/</guid><description>Numba translates Python functions to optimized machine code at runtime. 
Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN.</description><pubDate>Tue, 21 May 2019 00:00:00 GMT</pubDate><category>Python</category><category>Benchmark</category><category>Performance</category><author>Admin</author></item><item><title>Logging with python</title><link>https://villoro.com/blog/python-logging/</link><guid isPermaLink="true">https://villoro.com/blog/python-logging/</guid><description>An overview of my own custom logging library that handles console and file output.</description><pubDate>Thu, 16 May 2019 00:00:00 GMT</pubDate><category>Python</category><category>Tools</category><author>Admin</author></item><item><title>Encrypting secrets with python</title><link>https://villoro.com/blog/encrypting-secrets-python/</link><guid isPermaLink="true">https://villoro.com/blog/encrypting-secrets-python/</guid><description>Learn how a team can work securely with password by storing them encrypted in a dictionary-like file.</description><pubDate>Fri, 10 May 2019 00:00:00 GMT</pubDate><category>Python</category><category>Security</category><author>Admin</author></item><item><title>Creating Python packages</title><link>https://villoro.com/blog/creating-python-packages/</link><guid isPermaLink="true">https://villoro.com/blog/creating-python-packages/</guid><description>Learn how to create a python package that can be installed with pip.</description><pubDate>Thu, 02 May 2019 00:00:00 GMT</pubDate><category>Python</category><category>Tools</category><category>PyPI</category><author>Admin</author></item><item><title>Using AWS Lambda to control EC2 instances</title><link>https://villoro.com/blog/aws-lambdas-to-control-ec2/</link><guid isPermaLink="true">https://villoro.com/blog/aws-lambdas-to-control-ec2/</guid><description>Learn how to use AWS Lambdas to start/stop EC2 and RDS instances using an API.</description><pubDate>Tue, 23 Apr 2019 00:00:00 
GMT</pubDate><category>Python</category><category>AWS</category><author>Admin</author></item><item><title>Python decorators</title><link>https://villoro.com/blog/python-decorators/</link><guid isPermaLink="true">https://villoro.com/blog/python-decorators/</guid><description>Decorators are a really useful tool for python development but not a lot of people know what they are and how they work. In this post you are going to master them so that you can use them and create your own.</description><pubDate>Tue, 16 Apr 2019 00:00:00 GMT</pubDate><category>Python</category><author>Admin</author></item><item><title>Intro to Machine Learning (ML)</title><link>https://villoro.com/blog/intro-machine-learning/</link><guid isPermaLink="true">https://villoro.com/blog/intro-machine-learning/</guid><description>An introduction to what Machine Learning is and how it works using a real example.</description><pubDate>Fri, 12 Apr 2019 00:00:00 GMT</pubDate><category>Python</category><category>ML</category><category>Intro</category><category>Tutorial</category><author>Admin</author></item><item><title>Storing tables efficiently with Pandas</title><link>https://villoro.com/blog/storing-tables-efficiently-pandas/</link><guid isPermaLink="true">https://villoro.com/blog/storing-tables-efficiently-pandas/</guid><description>When working with pandas people usually need to store one or more tables. There are a lot of different formats to do that. 
In this post I am going to compare the performance between them.</description><pubDate>Wed, 03 Apr 2019 00:00:00 GMT</pubDate><category>Python</category><category>Benchmark</category><category>Performance</category><author>Admin</author></item><item><title>Serving python apps with nginx and gunicorn</title><link>https://villoro.com/blog/serving-python-app-nginx-gunicorn/</link><guid isPermaLink="true">https://villoro.com/blog/serving-python-app-nginx-gunicorn/</guid><description>How to serve Python apps with nginx and gunicorn.</description><pubDate>Thu, 21 Mar 2019 00:00:00 GMT</pubDate><category>Python</category><category>Web</category><author>Admin</author></item><item><title>Setting up Airflow</title><link>https://villoro.com/blog/setting-up-airflow/</link><guid isPermaLink="true">https://villoro.com/blog/setting-up-airflow/</guid><description>How to set up Apache Airflow, the platform to programmatically author, schedule and monitor workflows.</description><pubDate>Tue, 19 Mar 2019 00:00:00 GMT</pubDate><category>Tools</category><category>Orchestration</category><category>Tutorial</category><category>Setup</category><author>Admin</author></item><item><title>Creating EC2 instances in AWS</title><link>https://villoro.com/blog/creating-ec2-aws/</link><guid isPermaLink="true">https://villoro.com/blog/creating-ec2-aws/</guid><description>How to create an AWS EC2 instance. Detailed instructions on how to do it in a few minutes.</description><pubDate>Mon, 18 Mar 2019 00:00:00 GMT</pubDate><category>AWS</category><category>Tutorial</category><category>Setup</category><author>Admin</author></item><item><title>How to create a static website</title><link>https://villoro.com/blog/create-static-website/</link><guid isPermaLink="true">https://villoro.com/blog/create-static-website/</guid><description>See how I created a static webpage in a few minutes. 
And how you can do it too.</description><pubDate>Tue, 12 Mar 2019 00:00:00 GMT</pubDate><category>Python</category><category>Web</category><category>Tutorial</category><category>Setup</category><author>Admin</author></item><item><title>Basic tools for blogs</title><link>https://villoro.com/blog/basic-tools-for-blogs/</link><guid isPermaLink="true">https://villoro.com/blog/basic-tools-for-blogs/</guid><description>View how to set up a mailing list (Mailchimp), add a comments section (Disqus) on posts and track visits (Google Analytics).</description><pubDate>Sat, 02 Mar 2019 00:00:00 GMT</pubDate><category>Web</category><category>Tools</category><category>Setup</category><author>Admin</author></item><item><title>Singleton in Python</title><link>https://villoro.com/blog/singleton-python/</link><guid isPermaLink="true">https://villoro.com/blog/singleton-python/</guid><description>The singleton pattern will help you have only one instance and allow you to call only one sync function.</description><pubDate>Sun, 24 Feb 2019 00:00:00 GMT</pubDate><category>Python</category><author>Admin</author></item><item><title>String best practices with Python</title><link>https://villoro.com/blog/string-best-practices-with-python/</link><guid isPermaLink="true">https://villoro.com/blog/string-best-practices-with-python/</guid><description>There are a lot of ways to work with strings in Python. And there are some cool tricks I want to share that will make it easier to deal with strings.</description><pubDate>Sat, 16 Feb 2019 00:00:00 GMT</pubDate><category>Python</category><category>Best Practices</category><author>Admin</author></item><item><title>Using Material colors and other palettes</title><link>https://villoro.com/blog/material-colors-and-palettes/</link><guid isPermaLink="true">https://villoro.com/blog/material-colors-and-palettes/</guid><description>Choosing between colors is always a difficult task. 
But with the material design palette it is possible to solve most color-related problems easily.</description><pubDate>Fri, 15 Feb 2019 00:00:00 GMT</pubDate><category>Python</category><category>Web</category><category>Best Practices</category><author>Admin</author></item><item><title>SQL Alchemy with python</title><link>https://villoro.com/blog/sql-alchemy-python/</link><guid isPermaLink="true">https://villoro.com/blog/sql-alchemy-python/</guid><description>SQL is one of the most relevant languages for databases. So at some point you will need to interact with a database to get, store or modify data. Fortunately you can do it in Python using SQL Alchemy.</description><pubDate>Fri, 08 Feb 2019 00:00:00 GMT</pubDate><category>Python</category><category>SQL</category><category>Intro</category><category>Tutorial</category><author>Admin</author></item><item><title>Black for code formatting</title><link>https://villoro.com/blog/black-code-formatting/</link><guid isPermaLink="true">https://villoro.com/blog/black-code-formatting/</guid><description>Black is the uncompromising Python code formatter.
By using it, you agree to cede control over minutiae of hand-formatting.
In return, Black gives you speed, determinism, and freedom from pycodestyle nagging about formatting.
</description><pubDate>Thu, 07 Feb 2019 00:00:00 GMT</pubDate><category>Python</category><category>Tools</category><category>Best Practices</category><author>Admin</author></item><item><title>Personal webpage</title><link>https://villoro.com/blog/personal-webpage/</link><guid isPermaLink="true">https://villoro.com/blog/personal-webpage/</guid><description>I used to have a personal webpage made with WordPress. Since I have been working as a developer for some years I decided that I could create something that better suited my needs.</description><pubDate>Thu, 27 Dec 2018 00:00:00 GMT</pubDate><category>Web</category><category>Python</category><category>Setup</category><author>Admin</author></item></channel></rss>