Notes from Industry

Manipulate Pandas data frames from the comfort of your own browser!

We all love processing data, but sometimes it can be challenging to relay what we did to the data. I wanted to build an application that would help people define data flows while providing an intuitive and easy way to edit and share these flows with other people. Writing a data pipeline as a script is easy and effective for people with a solid technical background, but how does one show this to a manager or stakeholder? I decided it was time to build an application that did just this! …

FastAPI is an amazing library, but we can make some quality-of-life improvements when using it in a microservices architecture

Image by Zetong Li on Unsplash

The microservices paradigm essentially involves a set of small, discrete mini-applications working together as one larger application. This architecture enables smaller teams to support smaller parts of the application and clearly defines contracts between the different parts of an application. The most common way for microservices to communicate in this setup is via HTTP/RESTful APIs. When building applications as microservices, it is common to use a service discovery and configuration application like Consul.

Very rarely do I come across a library and think, “wow, this is beautiful”. The last time was the enigmatic Requests library for Python…

The default charts out of the box look great, but I often find myself making a few common changes to make them “pop”.

It wasn’t until recently, while building a reporting pipeline, that I discovered the custom templating feature of plotly. Before this, I was producing charts and editing how they looked afterwards. With custom templates in plotly, you can define a style that is applied as the chart is created. This handy feature reduced my reporting pipeline's complexity and has become a core part of how I use plotly both professionally and at home.

To see how this works, let's create a toy dataset. See below for the required imports and the generation of the dataset to help illustrate…

There is a simple reason why Directed Acyclic Graphs (DAGs) are used by most of the top workload managers!

A DAG expresses a set of interconnected nodes and puts a few hard limits on how those nodes can be connected. A DAG differs from a regular graph in two ways: each connection between nodes represents a one-way relationship, and the relationships between nodes cannot form a loop.

The diagram below shows a directed graph. However, it is not acyclic: if you were to step around the graph's nodes following the edges' directions, you would pass the starting point an infinite number of times and never reach a final node to stop on.

A Cyclic Directed Graph (Image by Author)
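The "no loops" rule can be checked mechanically. The sketch below is one common approach (Kahn's algorithm), not necessarily what any particular workload manager does; the function name `is_acyclic` and the example edge lists are made up for illustration.

```python
from collections import deque


def is_acyclic(edges):
    """Return True if the directed graph given as (src, dst) pairs has no cycle."""
    # Build adjacency lists and count incoming edges per node.
    successors = {}
    in_degree = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
        successors.setdefault(dst, [])
        in_degree[dst] = in_degree.get(dst, 0) + 1
        in_degree.setdefault(src, 0)

    # Repeatedly remove nodes that have no remaining incoming edges.
    ready = deque(node for node, degree in in_degree.items() if degree == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)

    # Any node never removed sits on, or downstream of, a cycle.
    return visited == len(in_degree)


cyclic = [("a", "b"), ("b", "c"), ("c", "a")]
acyclic = [("a", "b"), ("b", "c"), ("a", "c")]
print(is_acyclic(cyclic))   # False
print(is_acyclic(acyclic))  # True
```

The order in which nodes are removed is also a valid execution order, which is one reason this structure suits schedulers.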

Inspect all the things! Save on configuration! Now with 30% less technical debt!

Image by Author

Currently, I am building a web application with a Python back end. Essentially, the back end performs operations on Pandas data frames, while the front end is a React application that provides a drag-and-drop interface for building a graph of operations to process and analyze data!

As part of this application, I needed a way to define the types of processing nodes used in this application. This meant both the front end and the back end required knowledge of which nodes were defined and the arguments needed for each different type of node. Initially, I solved this issue by…

It’s important to consider the failure cases and return the appropriate response codes. Errors aren’t as bad when you know about them!

Most people familiar with web technology should know codes like 200 OK or 404 Not Found. That's great; you can get a really long way in API development with just those codes alone. However, many other response codes can give a little more context back to the requester. This article will talk a little bit more about what types of codes should be sent for success and failure from CRUD endpoints in an API.
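To make this concrete, here is a minimal sketch of CRUD-style handlers that return more specific codes than a plain 200 or 404. The in-memory `items` store, the function names and the choice of codes follow one common convention and are assumptions for illustration, not the mapping the article goes on to recommend.

```python
from http import HTTPStatus

# Hypothetical in-memory store standing in for a real database.
items = {}


def create(item_id, data):
    # Creating something that already exists is a conflict, not a success.
    if item_id in items:
        return HTTPStatus.CONFLICT, None
    items[item_id] = data
    return HTTPStatus.CREATED, data


def read(item_id):
    if item_id not in items:
        return HTTPStatus.NOT_FOUND, None
    return HTTPStatus.OK, items[item_id]


def delete(item_id):
    if item_id not in items:
        return HTTPStatus.NOT_FOUND, None
    del items[item_id]
    # Success, but there is nothing to send back in the body.
    return HTTPStatus.NO_CONTENT, None


status, body = create("a", {"name": "demo"})
print(status.value, status.phrase)  # 201 Created
```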

Photo by Erik Mclean on Unsplash

Response Codes

200 OK

The requester has made a request in which it expects the API to return some data. The API has located/constructed the data and returned it…

Cloud-based data warehousing has significantly impacted data science and analytics. Is this a bad thing?

Photo by Michael & Diane Weidner on Unsplash

Data science and analytics in the cloud is the modern paradigm for big data. Cloud storage and processing for data science and analytics arose due to the costs and requirements of storing and processing the ever-increasing amount of data on earth. The cloud's elasticity, cost benefits, security and physical location are the most fundamental factors in this situation. The increasingly common move from traditional on-premises data warehousing to modern cloud-based alternatives has brought large changes to the way data science and analytics are implemented.

Born of the ever-increasing amount of data on earth…

A simple effects pedal using a TL071 as the signal amplifier.

Image by Author

It’s been a while since my last foray into making my own effects pedals. Most of my designs so far have been either very simple or overly complicated, at no bonus to tone. I decided to take it back to basics and build a single gain stage overdrive/booster pedal.

We are creating better datasets from jagged time series data!

Building datasets for machine learning is a time-consuming process. I find that I tend to spend about 70–80% of my time on a project preparing and cleaning the datasets. Often how I process the datasets follows a similar path. I filter my data down to only the relevant samples and features, and then I clean it. The cleaning process itself changes often depending on the data type, but a simple solution is to drop the dirty rows if you have enough data to have an adequate sample size afterwards.

Once the data is cleaned, I then start manipulating it to…

Patrick Coffey

Patrick Coffey is currently a Data Architect and an avid practitioner of software development and data analysis/visualization.
