computer science and global warming



I’ve spent the last five years working on a new project called “Solving for Greenhouse Gases Using Machine Learning Models”, which will be part of my Ph.D. program at Harvard University starting in the fall of 2019. It is my first thesis, so there is one big difference between writing something you hope to publish and my goal here of tackling greenhouse gases with machine learning models. Still, the general idea of applying these kinds of techniques to a problem like this seemed like an obvious next step.

Machine Learning (ML) is a broad term that covers a wide variety of algorithms and tasks. It is often used when talking about prediction and classification on datasets labeled with known values or characteristics. Text analytics, for example, can be treated as a machine learning task: you look for patterns in your data by giving the system examples similar to ones it has already seen, and it makes predictions based on those patterns. For something like climate change, these methods are most useful for finding trends and correlations, which is important for understanding what changes are happening over time.
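To make "finding trends" concrete, here is a minimal sketch of fitting a straight-line trend to annual CO2 readings. The numbers are illustrative placeholders I made up for the example, not real measurements, and a least-squares line is about the simplest trend model there is:

```python
import numpy as np

# Toy annual CO2 concentrations (ppm) -- illustrative values, not real data
years = np.array([2000, 2005, 2010, 2015, 2020])
co2_ppm = np.array([369.0, 379.0, 389.0, 400.0, 412.0])

# Fit a straight line (degree-1 polynomial) to expose the long-term trend
slope, intercept = np.polyfit(years, co2_ppm, 1)
print(f"Estimated trend: {slope:.2f} ppm per year")

# The fitted line gives a naive extrapolation for a future year
projection_2030 = slope * 2030 + intercept
print(f"Naive 2030 projection: {projection_2030:.1f} ppm")
```

Real climate projections are of course far more involved, but even this toy fit shows the basic move: summarize historical data as a model, then use the model to look forward.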

I’m interested in what we could extract from the historical records and past climate models of various countries to help us understand the amount of carbon dioxide in our air, the current rate of change, and any trends in temperature over history.

Here is where Google comes into play. They have built an application that makes some basic predictions about whether rising CO2 levels should be a concern. Essentially, it is a system that lets people input data from weather models like this one, and it generates what seem to be fairly accurate predictions from that information. But why can ML make such accurate projections? First, it can draw on many more past models, which gives a much better interpretation of what happened than simpler approaches (for example, just counting observations since a baseline year like 1989). Second, it can compare models built from now on against models from long before, even pre-pandemic ones, to see which model is more reliable at predicting where the planet might be headed, especially if the trends of previous models line up. Let me give some background on the topic:

1) Many scientists use climate models to describe how much CO2 in the atmosphere matters. These models also describe what data is needed to build them, what data will be analyzed, and so on. The problem is that many of these models can be quite unreliable due to flaws in their assumptions, and some are not even accurate representations of reality to begin with, since millions of factors are involved in climate change. One of the most common ways to account for this lack of quality is statistical sampling. After all, data can only be collected by measuring the target variable in a sample, so the variables can be split into groups for analysis. Sometimes, however, sampling isn’t quite accurate: real samples can’t represent every aspect of reality. We can do better with data that is representative of the whole system, and there are usually multiple scenarios and possible values we can measure to get more realistic data. This technique has become increasingly popular in fields like business, where large amounts of data and many samples represent everything being evaluated, not just the thing being tested.

This is the practice where the researcher takes multiple sets of data and builds new datasets as a representation of reality. To do so, the variables can be split up into one random set, then another, and so on. Models are then built on top of historical predictions about carbon dioxide, and these models can’t be trusted without a good sense of their uncertainties. One way to reduce the errors in these models is to collect more historical samples and test them to see how accurate they are.
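One standard way to put "splitting data into random sets to gauge uncertainty" into practice is bootstrap resampling: refit the same model on many resampled copies of the data and look at how much the estimates spread. A minimal sketch, using synthetic CO2 readings I generate for illustration (not real measurements):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic annual CO2 readings (ppm): a 2 ppm/yr trend plus noise.
# Purely illustrative, standing in for a historical record.
years = np.arange(1990, 2020)
co2_ppm = 354.0 + 2.0 * (years - 1990) + rng.normal(0.0, 1.5, years.size)

# Bootstrap: refit the trend on many datasets resampled with replacement,
# and use the spread of the refit slopes as an uncertainty estimate.
slopes = []
for _ in range(1000):
    idx = rng.integers(0, years.size, years.size)  # sample with replacement
    slope, _ = np.polyfit(years[idx], co2_ppm[idx], 1)
    slopes.append(slope)

low, high = np.percentile(slopes, [2.5, 97.5])
print(f"Trend: {np.mean(slopes):.2f} ppm/yr (95% interval {low:.2f} to {high:.2f})")
```

The point is not the particular numbers but the habit: never report a fitted trend from historical data without some interval around it.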

2) How would knowing more about historical temperatures change our knowledge of the climate?

If you think about how much impact weather models have had on our understanding of temperature, you start to realize how much our understanding of history depends on being able to look back into ancient times. Imagine if our climate had been more consistent over the last 5,000 years: would we still know where Earth stands now, and why we are seeing record snowfalls even though the ice caps haven’t melted so far? That would be interesting, wouldn’t it? We could get very close to understanding our current state if we were able to trace all these historical events up to the present day. Or so I like to think. Maybe only a couple of things would come out of looking back through the archives, but it would be worth exploring.

3) I think I’ll write an article about what is likely to happen to carbon emissions if we go back another 7,200 years and how that might change our understanding. Maybe an updated version of the IPCC report that better explains how humans may have played a role in the transition, and hopefully provides insight into how we will think about these things in the years ahead.

