Out with the (c)old, in with the new Recently I needed to look into time-to-live (TTL) capabilities within Elasticsearch. It has been several years since I used this feature (version 1.5, to be precise – aeons ago!). Unsurprisingly, things have changed significantly since then. Instead of setting _ttl in the index mapping
Andrew Kenworthy
Introduction In an earlier article I briefly mentioned some possibilities for bringing Python-trained machine learning models into production, specifically into a Java environment. In this article I will take a closer look at how this can be done and what pitfalls we should avoid. Keras In the words of the Keras website,
Introduction When we work with classification or regression models we are seeking to predict either a discrete value, or a value along a continuum. The accuracy of these models is implicit in the metric we use when training: either in the cost function during the actual training, or something like
Introduction Why Dask? For those of us who regularly work with Python machine learning libraries, the pandas DataFrame is a firm fixture in our toolkit. Pandas DataFrames allow fast and efficient manipulation of data and offer a host of data wrangling functions. And the processing is fast because pandas does
Introduction This is the next article in my collection of blogs on anomaly detection. This time we will be taking a look at unsupervised learning using the Isolation Forest algorithm for outlier detection. I’ve mentioned this before, but this time we will look at some of the details more closely.
We can all think of examples of successful forecasting, but what should we be aware of and what pointers can we adopt going forward?
Not all models need to be pre-trained. Sometimes it is more effective to apply algorithms inline to small batches of data.
The M-competitions compare and evaluate different approaches to, and implementations of, time-series forecasting. Here is a brief review of the latest one, M5.
We must consider the languages used in model training and model deployment – and we should do this before any model work begins. It is better to treat the two environments (and sometimes the two teams) as a whole and then work to a common interface.
Introduction Unsupervised anomaly detection with unlabeled data – is it possible to detect outliers when all we have is a set of uncommented, context-free signals? The short answer is yes – this is the essence of how one deals with network intrusion, fraud, and other types of low-instance anomaly. In