Machine learning is incredibly difficult. Many people believe that technology is almost omnipotent – ​​capable of attacking any problem with ease – but often forget to factor in the effort required to produce useful, robust and unbiased results.

- Advertisement -

Often, the time taken to implement machine learning is not building up to the basics of machine learning, as most of the fundamentals are well solved and available. The difficulty is in preparing the data and adapting the model to be useful for real, real-world examples.

It can take months and millions of dollars to implement a truly effective machine learning model. The process typically involves incredibly lengthy and complex calculations, running either on supercomputers or on a massive grid of computers distributed on a cloud architecture. To be truly read, endless variations of the same problem must be run through the system to try and get better and better results through key metrics.


I’ve seen these difficulties the first time – twice. First, building automated trading strategies using forecasting tools at a crypto hedge fund, and second, at Google, helping with pipelining for machine learning data and annotations. These experiences have helped me learn that it is impossible to produce good machine learning without the requisite work and effort, in the same way that it is equally difficult to hire without a focus on fairness.

Most of the time, simple solutions are more useful than investing in machine learning.

The perfectly optimal answer may be only marginally better than a trivial answer – which will often suffice, especially when taking into account the cost of training a model, you’ll end up with a lot more than a human making an educated guess. may deteriorate.

- Advertisement -

However, for very difficult problems where marginally better answers can make a substantial impact, machine learning is an invaluable tool that can have a massive impact on overall results. Machine learning is an umbrella term for a large variety of different strategies that will work for different problems with varying levels of success.

For example, genetic algorithms are a popular strategy that work well in a variety of situations. During each iteration, all candidates are re-measured for success. This approach mimics Darwinian evolution by taking the most successful (defined per problem) candidates out of a larger pool and creating new candidates combining these.

In technical terms, in each repetition or epoch, each individual candidate is tested for his heuristic strength or fitness. For example, imagine that a baker is trying to find a recipe for the perfect brownie. In their first batches, they randomly change the ingredients. After baking these slightly different batches of brownies, they flavor each batch. Then, they take several great-tasting brownie recipes and mix those recipes together to create more recipes until the baker ends up with only the best-tasting brownie recipes.

Machine learning can be incredibly biased if not controlled.

This evolutionary strategy comes with a major flaw that is often problematic: the trend of diversity toward zero. In each generation of the system, the most successful candidate wins more often, resulting in a higher chance of being one of the items chosen for the next generation. This can compound over time to always move towards more and more similar candidates.

a lack of biodiversity Ecosystems are well understood as problematic, and AI is vulnerable to similar issues. For example, to automate the labor-intensive hiring process, Amazon accidentally developed a The Failed Resume Screener Who Likes Men Too, punishing graduates of women’s colleges and members of clubs with the word “women”. How did this happen? Algorithms, especially AI-based solutions, are completely dependent on the data provided to train them. When algorithms are trained on successful hires consisting largely of males, these biases are reproduced in the “objective” model.

These diversity issues are so prevalent in genetic algorithms that reducing bias is one of the most valuable uses of time to produce better results. We can improve genetic overfitting by introducing randomness and heterogeneity into the system, giving random objects a chance to become more prominent for a moment and see if they produce quality results, or have the potential to dominate it. By introducing completely diverse new item to chance.

We understand how to reduce bias in AI through rigorous empirical data, and we can take these lessons and apply them to society by reintroducing diversity into our corporate and academic populations. female makeup under 25 Number of tech roles in America’s biggest tech companies, and the number is even lower for women ~22% of all AI jobs. When this happens in machine learning, we introduce more diversity and make sure no group is under-represented, indicating the effort required by the people behind the screen.

Encouraging unique opinions keeps creativity from getting lost.

For this reason, it is mathematically important that we correct these imbalances and restore balance in the job market by guaranteeing representation of people from all backgrounds in all types of job markets. Gender and racial inequalities need to be addressed and are inherently dangerous to the overall success of the system.

Taking the lessons we learn from machine learning and genetic diversity, we can empirically guarantee that bringing more diverse candidates into high-profile openings is incredibly important, both to encourage diversity of ideas and To ensure that the next generation of leaders is not biased. towards certain groups. These problems are not easily rectified and a major effort will be needed to shift the overall model towards a purely meritocratic one.