Queues are commonly used to line up work to be done. Consider a queue at a store: the first person in line is the first to be served. This is referred to as a "first in, first out" (FIFO) queue. This post details an implementation of a FIFO queue using a linked list, and hopes to be a useful algorithm interview study reference. All code is available on GitHub at dm03514/python-algorithms.
Queues support the following methods:
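A minimal sketch of a linked-list-backed FIFO queue. Class and method names here are illustrative, not necessarily those used in dm03514/python-algorithms.

```python
class _Node:
    """A single linked-list node holding one queued value."""
    def __init__(self, value):
        self.value = value
        self.next = None

class Queue:
    def __init__(self):
        self._head = None  # next item to dequeue (front of the line)
        self._tail = None  # most recently enqueued item (back of the line)

    def enqueue(self, value):
        """Add value to the back of the queue in O(1)."""
        node = _Node(value)
        if self._tail is None:
            self._head = self._tail = node
        else:
            self._tail.next = node
            self._tail = node

    def dequeue(self):
        """Remove and return the front value in O(1)."""
        if self._head is None:
            raise IndexError("dequeue from empty queue")
        node = self._head
        self._head = node.next
        if self._head is None:
            self._tail = None
        return node.value

    def is_empty(self):
        return self._head is None

q = Queue()
q.enqueue("first")
q.enqueue("second")
print(q.dequeue())  # "first": first in, first out
```

Tracking both head and tail pointers is what makes both operations constant time; a list-backed queue would pay O(n) on one end.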
Data operations is the practice of keeping data pipelines operational and the data flowing. Data operational maturity focuses on answering:
This data operational maturity model draws heavily from Site Reliability Engineering (SRE). It treats data operations as a software problem, in the same way that SRE is built on treating operations as a software problem, building observability into the very foundation of data systems.
A maturity model is important for establishing a uniform approach and mental model. What do you think of when asked: "What’s…
Large data deployments often contain multiple levels of derived tables. Understanding lineage is critical for resolving data downtime and bugs. Table hierarchies often grow out of control over time, which makes maintaining documentation and debugging data issues time intensive and costly. This post describes a method for deriving data lineage by inspecting SQL statements. The database provides a centralized location where many different teams and ETL jobs come together, which makes it a great candidate for generating metadata like lineage.
Database deployments often combine tables to create new tables. Lineage describes how a table is derived, i.e. which tables it…
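The core idea can be sketched in a few lines: inspect a `CREATE TABLE ... AS` statement and record which tables feed the new one. This is a simplified illustration, not the post's implementation; a production version would use a real SQL parser rather than regular expressions.

```python
import re

# Illustrative only: FROM/JOIN inspection conveys the idea, but a real
# lineage extractor should use a full SQL parser.
TARGET = re.compile(r'\bCREATE\s+TABLE\s+([\w.]+)', re.IGNORECASE)
TABLE_REF = re.compile(r'\b(?:FROM|JOIN)\s+([\w.]+)', re.IGNORECASE)

def lineage(sql):
    """Return (target_table, source_tables) for a CREATE TABLE AS statement."""
    target = TARGET.search(sql)
    sources = sorted(set(TABLE_REF.findall(sql)))
    return (target.group(1) if target else None, sources)

stmt = """
CREATE TABLE analytics.daily_orders AS
SELECT o.day, COUNT(*) AS n
FROM raw.orders o
JOIN raw.customers c ON o.customer_id = c.id
GROUP BY o.day
"""
print(lineage(stmt))
# ('analytics.daily_orders', ['raw.customers', 'raw.orders'])
```

Running this over every DDL statement the warehouse executes yields an edge list, from which the full lineage graph can be assembled.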
TL;DR I hit the 1024 file descriptor limit in Lambda. AWS has a bug handling this case, in which the END and REPORT logs were omitted. The fix required bounding file descriptor usage by configuring the Go HTTP client (*http.Client).
I recently worked on a project to extend application log retention and reduce costs. Part of the solution involved aggregating raw application log files from S3. These log files range in size from 100KB → ~1MB and numbered ~30,000 files per hour. A Go program was created to aggregate the raw log files into 32MB files. The program…
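The post's actual fix configures transport limits on Go's *http.Client. The same principle, capping concurrent work so open sockets and files never exceed a bound, can be sketched in Python with a semaphore; `fetch` here is a hypothetical stand-in for an S3 object read.

```python
import threading

# Sketch of the principle only: bound in-flight work (and therefore open
# file descriptors) with a semaphore. The post's real fix sets connection
# limits on Go's *http.Client instead.
MAX_OPEN = 64  # stay well below the 1024 fd limit
_slots = threading.Semaphore(MAX_OPEN)

results = []
def fetch(key):
    with _slots:  # at most MAX_OPEN fetches (fds) in flight at once
        results.append(f"aggregated {key}")  # stand-in for reading the object

threads = [threading.Thread(target=fetch, args=(f"logs/{i}.gz",))
           for i in range(200)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))  # 200
```

Without a bound of this kind, each in-flight request can hold its own descriptor, and ~30,000 files per hour makes hitting the limit a matter of time.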
Views are an important tool in legacy warehouse environments. Views act as an interface that decouples clients (like Tableau, Looker, applications, etc.) from underlying data sources. Enabling backwards-compatible changes is especially important in legacy environments, where the full scope of clients may be unknown. This article explains how views can be used to enable backwards-compatible data migrations, which avoid the need to update any client queries.
Legacy warehouse environments for mid- or large-size companies may have tens or even hundreds of distinct clients. Clients can be:
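The mechanics can be demonstrated end to end with sqlite3 (table and column names below are invented for illustration): clients query the view `orders`, so the underlying table can be migrated, here reshaped to store dollars instead of cents, while the view preserves the old contract.

```python
import sqlite3

# Clients query the view `orders`, never the backing tables.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders_v1 (id INTEGER, amount_cents INTEGER)")
db.execute("INSERT INTO orders_v1 VALUES (1, 500)")
db.execute("CREATE VIEW orders AS SELECT id, amount_cents FROM orders_v1")

# Migration: the new table stores dollars; the view is rebuilt so the
# client-facing schema (id, amount_cents) is unchanged.
db.execute("CREATE TABLE orders_v2 (id INTEGER, amount_dollars REAL)")
db.execute("INSERT INTO orders_v2 SELECT id, amount_cents / 100.0 FROM orders_v1")
db.execute("DROP VIEW orders")
db.execute("""CREATE VIEW orders AS
              SELECT id, CAST(amount_dollars * 100 AS INTEGER) AS amount_cents
              FROM orders_v2""")

# The client query works identically before and after the migration.
print(db.execute("SELECT amount_cents FROM orders").fetchall())  # [(500,)]
```

Because clients only ever see the view, the migration requires zero changes to client queries, which matters when the full set of clients is unknown.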
Snowflake does not currently support subquery pruning, which can have serious cost implications for DBT incremental updates against external tables. Care must be taken when using DBT incremental models to query large partitioned external tables. This post shows how to address the issue by breaking the incremental predicate out of a subquery.
DBT incremental models load data gradually. Each run focuses on a limited (i.e. incremental) dataset, as opposed to a full dataset. This requires a query predicate to limit the dataset. This predicate is often based on event time. …
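The shape of the fix can be sketched with sqlite3 standing in for the warehouse (table names are invented for illustration): instead of embedding the watermark as a subquery, which Snowflake cannot use to prune external-table partitions, fetch it in its own statement and pass the literal value.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE source_events (event_time TEXT, payload TEXT)")
db.execute("CREATE TABLE target (event_time TEXT, payload TEXT)")
db.executemany("INSERT INTO source_events VALUES (?, ?)",
               [("2023-01-01", "a"), ("2023-01-02", "b"), ("2023-01-03", "c")])
db.execute("INSERT INTO target VALUES ('2023-01-01', 'a')")

# Instead of:
#   WHERE event_time > (SELECT max(event_time) FROM target)
# Step 1: compute the incremental watermark in its own statement.
(watermark,) = db.execute("SELECT max(event_time) FROM target").fetchone()

# Step 2: use the concrete value as the predicate, which the warehouse
# can use to prune partitions.
rows = db.execute(
    "SELECT * FROM source_events WHERE event_time > ?", (watermark,)
).fetchall()
print(rows)  # [('2023-01-02', 'b'), ('2023-01-03', 'c')]
```

The two-step form trades one extra (cheap) query for a predicate the optimizer can actually prune on, which is where the cost savings come from.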
Probing is a technique for performing regular checks on a service at a short interval. Probes provide signals that can significantly cut down debug time. This post describes probes and how they can be used to drill down into errors and make debugging more focused: how they partition the debug space.
Probes are targeted checks, performed as request / response actions, on a short (~1 minute or less) interval. Some common applications of probes are:
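A probe loop can be sketched in a few lines. Here `check` is any request/response action, an HTTP GET, a synthetic login, a DB ping; the flaky check below is a stand-in so the sketch runs without a network.

```python
import time

def probe(check, interval_s, iterations):
    """Run `check` every interval_s seconds, recording pass/fail signals."""
    signals = []
    for _ in range(iterations):
        try:
            check()
            signals.append("pass")
        except Exception:
            signals.append("fail")
        time.sleep(interval_s)
    return signals

calls = {"n": 0}
def flaky_check():
    """Stand-in for a real request/response check; fails on its 2nd call."""
    calls["n"] += 1
    if calls["n"] == 2:
        raise RuntimeError("service unavailable")

print(probe(flaky_check, interval_s=0.01, iterations=3))
# ['pass', 'fail', 'pass']
```

The value is in the time series of signals: a probe that fails at 14:02 and recovers at 14:05 immediately bounds when, and often where, to look.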
Successfully changing systems requires an understanding of the current system’s state. Profiling is a tool for understanding systems at a point in time. Without a good understanding of the current state, changes can be suboptimal, counterproductive, or even dangerous. Profiling is used to break down a system’s current state along dimensions, and is a prerequisite for successfully modifying systems.
Profiling describes the current state of a system. Knowing the current state helps to inform changes. Consider the following goals and how profiling helps each.
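As a concrete example, Python's built-in cProfile breaks a program's behavior down along the "which function" dimension, giving a view of where time goes before anything is changed. The workload below is a toy stand-in.

```python
import cProfile
import io
import pstats

def parse(record):
    """Toy per-record work to give the profiler something to measure."""
    return record.split(",")

def load(n):
    return [parse(f"{i},user,event") for i in range(n)]

profiler = cProfile.Profile()
profiler.enable()
load(10_000)
profiler.disable()

# Break the run down by cumulative time per function.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

The same idea generalizes beyond CPU time: profiling a system means breaking its state down along whatever dimensions (queries, endpoints, tables, tenants) matter for the change being considered.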
JSON is the de-facto logging standard. JSON is so ubiquitous that popular logging data tools (such as Elasticsearch) accept JSON by default. Although JSON is an evolution over previous logging standards, JSON’s lack of strict types makes it insufficient for long-term persistence or as a foundation for a data lake. This post describes the problem with JSON and proposes a solution using a strictly typed interchange format such as Protocol Buffers.
JSON logs establish an explicit structure. JSON parsers are available in most languages, which makes JSON accessible as a log standard. JSON logs are referred to as…
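The core problem can be shown in a few lines: JSON imposes no schema, so the same field can silently change type across log lines, and downstream consumers only discover the drift at read time.

```python
import json

# Two valid JSON log lines with the same fields but drifted types.
log_lines = [
    '{"user_id": 42, "latency_ms": 12.5}',
    '{"user_id": "42", "latency_ms": "fast"}',
]
types = [{k: type(v).__name__ for k, v in json.loads(line).items()}
         for line in log_lines]
print(types)
# [{'user_id': 'int', 'latency_ms': 'float'},
#  {'user_id': 'str', 'latency_ms': 'str'}]
```

Both lines parse without error; nothing in JSON itself rejects the second one. A strictly typed format like Protocol Buffers would refuse the drifted record at write time instead.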
Alerting on SLOs is an SRE practice which enables teams to be proactively notified when a level of service is not being met. When an SLO alert fires, teams can be confident that a client is impacted. Alternative alerting techniques have difficulty quantifying customer impact, which can complicate incident response. This post describes SLO alerts and the benefits they provide over alternatives. The Google SRE book describes how to alert on SLOs, and this post aims to describe why to alert on SLOs.
An SLO is a quantifiable target representing a client’s experience. SLOs are built on SLIs, and SLIs are…
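The arithmetic behind these definitions fits in a few lines. An SLI is a ratio of good events to total events, an SLO is a target for that ratio, and the error budget is what the SLO leaves over; the request counts below are invented for illustration.

```python
good_requests = 99_720
total_requests = 100_000

sli = good_requests / total_requests  # measured level of service: 0.9972
slo = 0.999                           # target: 99.9% of requests succeed
error_budget = 1 - slo                # fraction of requests allowed to fail
budget_consumed = (total_requests - good_requests) / (error_budget * total_requests)

print(f"SLI={sli:.4f}, SLO={slo}, error budget consumed={budget_consumed:.0%}")
# SLI=0.9972, SLO=0.999, error budget consumed=280%
```

At 280% of budget consumed, an SLO alert fires with a built-in statement of client impact, which is exactly what threshold-on-a-metric alerts struggle to provide.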