Amanda Hering

Season 3 - Episode 310

March 6, 2020

Dr. Amanda Hering applies her statistical expertise to a “once in a generation effort” to improve water treatment in the U.S. Hering, associate professor of statistical science, is an internationally recognized leader in data sciences and their intersection with the environment. In this Baylor Connections, she shares how data can help us utilize new paradigms of water treatment and delivery and examines the need for students trained in the data sciences in the years ahead.

Transcript

Derek Smith:

Hello and welcome to Baylor Connections, a conversation series with the people shaping our future. Each week we go in depth with Baylor leaders, professors and more, discussing important topics in higher education, research and student life. I'm Derek Smith and our guest today is Dr. Amanda Hering. Hering is part of the National Alliance for Water Innovation, a team comprised of faculty members and researchers from leading universities and national labs that last fall was awarded a $100 million grant from the US Department of Energy to transform the US water system through desalination. Hering's research focuses on ways to use data to better operate decentralized water treatment plants. Additionally, she received a National Science Foundation grant last year to create a data sciences core curriculum and undergraduate research program at Baylor, a model for providing data sciences instruction to students from a variety of disciplines. Lot going on there with your work these days.

Amanda Hering:

It's been a busy year. Thank you for having me.

Derek Smith:

Thank you. Well, it is great to have you here on the program. And let's start out with a little bit of a 30,000-foot view. When we talk about data sciences, or when people have listened to the show and heard us say that data sciences are one of the five signature academic initiatives in Illuminate, what are we talking about more broadly?

Amanda Hering:

Right. There's been a huge proliferation of data in the world and a lot of industries that maybe historically have not collected data or are beginning to collect it and store it and they want to mine that data for information so they can make better decisions. There's been a big discussion, I would say in both the statistics, computer science, mathematics fields about what data science is because really it's a blend of all of those disciplines along with some sort of subject matter expertise. There's some discipline, some research field where they're collecting the data and maybe don't know what to do with it, don't know what the best way to model it or to analyze it is. They have questions about how to store it, how to maintain their data, how to keep it secure. We hear about data breaches all the time at major credit card companies or banks. All of that is bound up in our digital signatures and data that's being collected every day. To me, in teaching this Introduction to Data Science class, which I know we'll talk about.

Derek Smith:

Mm-hmm (affirmative).

Amanda Hering:

Has sort of forced me to take a really close look at what data science is. And for me it really is that blend of computer science, statistics and subject matter expertise that are all combined to answer some important questions with data.

Derek Smith:

Mm-hmm (affirmative). And just from hearing you talk, is this something that's really growing beyond what we would think of when you think about students who are going to be part of the future workforce? Maybe not every student's going to need data sciences, but just how much is that expanding as you mentioned into fields maybe you don't traditionally think of?

Amanda Hering:

Yeah, I think that there've been a few studies lately that even students who maybe don't have a major in data science or statistics or computer science, if they can just get some training in that field regardless of what their discipline is, they can be a tremendous asset to their employer. And those students who come out with some data science training in their degree often wind up with a higher salary than other students do. I think it's something that, it's a pervasive part of our world and those people who are proficient with it and are comfortable with it will ultimately, I think be a real asset to their employers.

Derek Smith:

Now you're an expert in your field, but you're always learning more because it's rapidly changing. Can you give us a sense of just how rapidly changing it is and probably how much even you, as a professor, are working to stay on top of trends?

Amanda Hering:

Yeah, I think that the concepts have existed for a long time, but in today's world, in the last say five to 10 years, people are really beginning to put them all together to extract information from maybe these new and novel datasets. I think that the demand for data analysis is greater now than it's been in the past. As far as the... A lot of the core concepts needed in data science have been around for a really long time.

Derek Smith:

Visiting with Dr. Amanda Hering, Associate Professor of Statistical Science at Baylor. What are some of the most prevalent issues in the data science field right now? Maybe challenges, opportunities that you talk about with your students?

Amanda Hering:

I think that some of the challenges in the field are that people are trying to define what the roles and responsibilities are for someone who has this title or this label of data scientist. What are the best job titles and descriptions? Because if you are looking to hire a data scientist, it could be someone who's more on the data management end of things, where you might need more of a computer scientist, or it could be someone more on the data analysis side of things, where you might want more statistical skills. I think that all of that is sort of in flux at the moment. There's not a standard set of definitions or vocabulary yet in terms of this is what we expect someone with this job title will do. And I think in academia the struggle is trying to decide, and a lot of universities have already made this decision, but do we need a degree that's specifically dedicated to data science? For example, here at Baylor we've got a lot of courses in computer science, a lot of courses in management information systems, a lot of courses in statistics, and even in economics. We've got these sort of pockets of data analysis flavored courses. But do we need to sort of put them all together and organize them into a degree? I think these are the sorts of questions that people are wrestling with right now.

Derek Smith:

Talking with Dr. Amanda Hering. And Dr. Hering, we can show an application of this now that ties into the work you've done through decentralized water treatment and, later on, desalination. You eventually joined the faculty at the Colorado School of Mines and found the intersection of those two things through work on decentralized water treatment plants. Could you take us on... Give us a little tour of what those are?

Amanda Hering:

Yeah, so my very good friend and colleague Tzahi Cath, he is an environmental engineer at Colorado School of Mines. One of his industry partners, Aqua Aerobic, donated a small decentralized wastewater treatment system to his program. And they installed it there on campus so that it is treating water from an apartment complex on campus. He's sort of the one that drew me into this area and got me interested in it. And it's just so compelling now because water is so fundamental to life and lots of communities across the country and around the world are struggling with having the right amount of water in the right quality at the right time. This is one potential solution for those water-stressed communities. And it's not necessarily that all of these water-stressed communities are in places where there's very little precipitation. We think about the arid West, about California, as being places that really are water-stressed communities, but it's places on the East Coast as well. Atlanta, Florida, there's all sorts of issues, Michigan. There are issues across the country and around the world, even when you're in a place that has high precipitation. So that's not the only barrier to... Or the only predictor for the community being water stressed.

Derek Smith:

How could a decentralized wastewater treatment plant de-stress that? And maybe particularly as it relates to maybe what we think of as the large municipal water plant in a city?

Amanda Hering:

Right. The traditional paradigm for a lot of communities is water gets pumped into homes and businesses, it gets used and then it all goes into the wastewater pipes and it all gets pumped to a centralized water treatment facility. And the pumping to that location takes money, and then once it's there, they treat the water and it gets released back into the river. It's like you're sort of using your water once and you're throwing it away. The idea with a decentralized wastewater treatment facility is, instead of pumping the water to some centralized location, which might be far away, let's collect it close to the source where it is generated, treat it, and potentially reuse it locally, so you could reuse it for watering the grass or irrigating crops or washing your car or flushing your toilet. Do we need to treat all water to the same standard? Within the US it's not that water from a decentralized wastewater treatment facility would replace all of your current water sources, but you would add it to a community's sort of portfolio of all their sources, groundwater, surface water, and it just gives communities, I think, a little bit more flexibility.

Derek Smith:

And you've mentioned before when you think about staffing, wouldn't have the staff of a large municipal water treatment plant. How does the data in terms of sensors help stop problems before they get severe?

Amanda Hering:

Right. In a centralized wastewater treatment facility, they have operators onsite 24/7 so there's constantly someone there. If a pump goes out they can fix it immediately, they can monitor all the time, all the different water quality parameters. They're always going out and taking grab samples and so there's just someone physically there. But if you imagine this paradigm where you have lots of small, decentralized facilities, you may not be able to have an operator on site at every single one of them. The idea would be instead, you would have one operator who can remotely monitor multiples of these decentralized facilities simultaneously. And because they're smaller, they get smaller batches of water that come into them, whereas the centralized facility, they get all the water from everyone and it gets all sort of mixed up. If you have a bad batch of water, someone flushes some sort of really strong chemical down the toilet, a flushing event, that's going to have a much different impact on a decentralized facility than it would on a centralized facility. With a decentralized facility, these tend to have not only these advanced membranes that separate water from solids, but they often will have a biological community that does biological nitrogen removal. These are bacteria. They eat different constituents in the water and degrade all the nasty stuff that we want to get out. But these biological communities can be sensitive to different constituents in the water that's coming in. If you get something that's really horrible, then it could potentially kill that biological community and it can take a couple of months for it to recover. With sensor data, the idea is that you have lots of different features that are monitoring the plant that are being recorded rapidly over time.
For example, every one minute is the kind of data that we're dealing with, and they're monitoring features that we can use to map back to: this particular feature is changing, that could potentially affect the biological community that's treating the wastewater, or this particular feature is changing, so that might indicate that we have a clog in a pipe somewhere or we have a malfunctioning blower. And we can often use the data to see those sorts of problems in advance of even when an operator might be able to tell visually based on the data that there's an issue. We collect lots of different types of features over time and we try to use that, ingest it all and hopefully collect some data or a lot of data under normal operating conditions: this is what we expect the values of all these features to be when the plant is operating normally. And then when something goes out of whack or there's some sort of problem, then we can detect it by comparing the new data that's coming online to what has already been observed historically.
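
The comparison of new readings against data collected under normal operation can be illustrated with a minimal sketch. It is not the method used by Dr. Hering's team; it assumes a simple per-feature z-score rule, and the feature names, readings, and the threshold of 3 are hypothetical values chosen for illustration.

```python
from statistics import mean, stdev


def fit_baseline(normal_rows):
    """Estimate a per-feature mean and standard deviation from normal-operation data."""
    columns = list(zip(*normal_rows))
    return [(mean(col), stdev(col)) for col in columns]


def flag_features(row, baseline, threshold=3.0):
    """Return the indices of features that sit more than `threshold`
    standard deviations away from their normal-operation mean."""
    flagged = []
    for i, (value, (mu, sigma)) in enumerate(zip(row, baseline)):
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical one-minute readings: (dissolved oxygen, blower air flow, pressure)
normal = [
    (2.0, 50.0, 10.0),
    (2.1, 51.0, 10.2),
    (1.9, 49.0, 9.8),
    (2.0, 50.5, 10.1),
    (2.05, 49.5, 9.9),
]
baseline = fit_baseline(normal)

print(flag_features((2.0, 50.2, 10.0), baseline))  # typical reading: []
print(flag_features((2.0, 30.0, 10.0), baseline))  # blower flow collapses: [1]
```

Because the flag names the feature that moved, it can be mapped back to a likely cause, such as a malfunctioning blower, in the way described above.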

Derek Smith:

How challenging is it or what needs to be done to determine, I think this would be not just for decentralized water treatment plants, but really all kinds of businesses who have a lot of data to determine what we sift through in the data that really is significant?

Amanda Hering:

Yeah, it's tough. 30 years ago we had small datasets and the challenge was getting any useful information out of a... Detecting any sort of pattern at all. And now with large datasets, the danger is that you detect patterns that don't matter. You can detect really fine scale differences among particular features and that's something that we really pay attention to. In our decentralized wastewater treatment facility context, the danger would be that we issue a flag that there is a problem and there's not one. Those are called false alarms. And if you get a lot of those false alarms, it's like crying wolf. After a while the operator just begins to ignore those. We don't want a lot of false alarms where you're issuing signals that are not needed. But then there's also this sensitivity where you do want to capture and issue flags when there is a problem. It's a bit of a fine line: you want a method that isn't going to issue false alarms but is sensitive enough to issue an alarm when there actually is a fault. When we're developing new methods, those are some of the metrics that we look at to see how a method performed. We look at how many false positives does it issue, how rapidly does it identify a fault when there is one. We study a lot of that stuff with artificial data before we implement it in real life.
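
The two metrics mentioned here, false alarms and how quickly a fault is identified, can be sketched on artificial data. This is an illustrative assumption rather than the team's actual evaluation: the simulated readings are standard normal noise with a mean shift at a known fault time, and the alarm rule, shift size, and threshold are all hypothetical.

```python
import random


def evaluate(alarms, fault_start):
    """Return (false_alarm_rate, detection_delay) for a list of boolean alarms.

    false_alarm_rate: share of pre-fault time steps that raised an alarm.
    detection_delay: steps from fault onset to the first alarm, or None if missed.
    """
    pre_fault = alarms[:fault_start]
    false_alarm_rate = sum(pre_fault) / len(pre_fault) if pre_fault else 0.0
    delay = None
    for step, alarm in enumerate(alarms[fault_start:]):
        if alarm:
            delay = step
            break
    return false_alarm_rate, delay


def simulate(n=500, fault_start=400, shift=4.0, threshold=3.0, seed=0):
    """Artificial readings: noise around 0, shifted upward once the fault begins.
    An alarm is raised whenever a reading is more than `threshold` from 0."""
    rng = random.Random(seed)
    readings = [
        rng.gauss(0.0, 1.0) + (shift if t >= fault_start else 0.0)
        for t in range(n)
    ]
    return [abs(x) > threshold for x in readings]


alarms = simulate()
rate, delay = evaluate(alarms, fault_start=400)
print(f"false alarm rate: {rate:.3f}, detection delay: {delay} steps")
```

Lowering the threshold in this sketch shortens the detection delay but raises the false alarm rate, which is the fine line described above.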

Derek Smith:

Now we mentioned at the top of the show that you are a part of the National Alliance for Water Innovation. The $100 million grant from the Department of Energy last fall was a competitive grant, so there were other people vying for this. The team that you're a part of, with other leading universities and national research labs, won that, and I know that's very exciting.

Amanda Hering:

Yes.

Derek Smith:

And can you tell us a little about how... What aspects of what we've just been talking about play into that and how this extends that even further?

Amanda Hering:

Yeah, so again, my good colleague Tzahi Cath got me involved with this, and he and I have been funded through a couple of different National Science Foundation grants up to now. One was an Engineering Research Center and the next was... It's called a Partnership for Innovation Business Industry Collaboration, so two of these NSF awards. And so we've been working together for a while, and when this particular opportunity arose, the Department of Energy began to organize workshops to sort of rally all of the academics and researchers in the US who do water treatment research, and that was about two or three years ago they started. People began to self-organize onto teams and DOE was promising, we're going to release this call for proposals that's going to be really large, where we're going to try to get people to focus on the nexus between energy and water treatment. Yeah, I just started going to the meetings and networking with other people in the group. And like you said, it's a large group and I think there were probably about four or five other teams who organized separately. Our team is comprised of three founding national labs, 19 universities and 10 industry partners.

Derek Smith:

Wow.

Amanda Hering:

Those are sort of the ones who are there from the beginning. But I think part of what NAWI wants to do now that they've been selected for this award is sort of open the door to anyone in the US who has an interest and an expertise in water treatment and some aspect of water treatment and invite everyone to come in and do this research and do this work. They want the best of the best to do it. Right?

Derek Smith:

That's good. What is desalination and what are the Alliance's goals?

Amanda Hering:

Yeah, desalination is just... There's multiple ways that it can be performed, but it's a process that separates salts and minerals from water. In the US, at least on a large scale, the way that most people do it is with reverse osmosis. And that just uses a semipermeable membrane. And they use pressure to induce water permeation through the membrane and reject salts, and that requires a lot of energy. That tends to be a barrier to more people using desalinization sort of broadly. There are some desalinization facilities in the US, but the water that's produced by those facilities is more expensive than, say, treating groundwater or surface water to drinking water standards. NAWI's goal is to try to get water treatment technologies to the point where they're secure and affordable and energy efficient and that we can use non-traditional sources of water. So not just your typical surface water but ocean water, brackish groundwater (brackish groundwater is just salty groundwater), agricultural runoff or produced water from industry, say the oil and gas industry. There's a lot of different opportunities and there's a really... That's why it's such a big award, is it's a really big problem, because you have all different types of water that you start off with and you might want to treat them to different levels of quality. You have lots of different beginning points and lots of different end points, and how you get from one... From your beginning point to your end point just may vary depending on what treatment technologies you have available to you and the cost effectiveness of those. This is just a once in a generation investment by the United States in improving water treatment technologies.

Derek Smith:

Over a five year period of time?

Amanda Hering:

Yes. It's initially planned or slated to be over a five year period of time.

Derek Smith:

Obviously, you've laid a lot of the groundwork for this. What does the beginning of that process look like?

Amanda Hering:

I'm the site director at Baylor, and I can officially say that now that the contracts have been signed between the Department of Energy and our Alliance as of February 12th, and my role is really twofold. One, I'm a researcher, so I'm on a first year project where we're looking at integrating data from some water treatment technology system to use in controlling optimally how that system is run. But on the other side of that, I'm the site director at Baylor, which means that I'll communicate to the Baylor faculty and administration anything that's going on with NAWI so that people can be aware of what opportunities might exist within the Alliance.

Derek Smith:

We are visiting with Dr. Amanda Hering here on Baylor Connections, heading into the final couple of moments on the program. We'll look forward to hearing more about that as a lot of that work continues in earnest, but I want to close by asking you briefly about another grant. We talk about these things using data sciences, whether it's for water treatment, the environment, a number of areas. A big part of building on that is training the next generation. You received a $1.2 million NSF grant to establish a data sciences curriculum and undergraduate summer research program here at Baylor. What are your hopes for that class?

Amanda Hering:

Yeah, the class is called Introduction to Data Science and there's no prerequisites. My hopes are that lots of students take it. And my hope is that when they take it, they're really inspired by all that they can do with data science and that they're inspired to go on and take additional courses in statistics or computer science so they can, at a minimum, maybe minor in one of those disciplines so that they can build their resume. I also hope that this is a cross-disciplinary effort between statistics and computer science faculty at Baylor and at Colorado School of Mines. And I hope that as faculty we can learn to work together and sort of speak each other's language, because oftentimes what I mean by data science when I say that is different than what someone else might mean when they say data science. This is a good opportunity for us to learn more about what each other considers to be data science and how we approach teaching that in the classroom.

Derek Smith:

Well, that's exciting, and we look forward to seeing that grow. And we appreciate you sharing that and sharing your time with us today. Dr. Amanda Hering, Associate Professor of Statistical Science, with us today on Baylor Connections. I'm Derek Smith, reminding you that you can hear this and other programs online at baylor.edu/connections. Thanks for joining us here on Baylor Connections.