Dunking computer servers in liquid oil: SKA scientist breaks new cooling ground

JOHANNESBURG — The Square Kilometre Array (SKA) project is set to help scientists peer further into space than they’ve ever done before. The project, which is based in South Africa’s Northern Cape and Australia, also requires huge amounts of data processing. In fact, once it gets going, it will use up more data than that which is used across the globe on a daily basis. Keeping server equipment cool in this environment then will be a key challenge. But Aphiwe Hotele, a computer scientist working on the project, is testing a unique solution that involves dunking server electronics in oil to keep them cool. It’s a mind-boggling concept and well worth a listen. – Gareth van Zyl

With me on the line from Cape Town is Aphiwe Hotele. Aphiwe, you work for the Science Data Processor team at the Square Kilometre Array Project. Can you tell us exactly what you do there?

Yes, I joined the SKA in 2015 as part of a programme known as the Young Professionals Development Programme. I was one of the first students that were taken up in the programme and, at the time, I think we were only five students that were selected for the programme. I joined the Science Data Processor Team with an interest in computer science because I had my Honours in Computer Science and I had indicated to them that I also wanted to get involved with a little bit of engineering. So, I was given a project where, at the time, the Science Data Processor Team was looking into finding a new technique to address issues such as cooling servers, trying to reduce cost and power consumption by reducing the cost of cooling.

They came up with this technique called “immersion cooling”, a process where you submerge electronics in oil in order to cool them down as opposed to the normal air-cooling where you actually use fans to direct the heat away from the components. I was given this project and, at that time, it was still very early because no one really knew much about immersion cooling and how it was going to turn out for the SKA. I worked on looking at immersion cooling for the Science Data Processor team as a way to cool down our computers because with the amount of data that we get from the telescopes, we have to worry not only about processing the data, but also about cooling the electronics that we use to process the data.

I worked on looking at immersion cooling and I developed a system, which I called the Environmental Monitoring System. This is a system that monitors the temperature, the humidity, and the power in this immersion cooled environment. I also built a predictor, which is a system that predicts temperature in locations where we don’t have a temperature sensor. The immersion cooled environment is a liquid one, but we use mineral oil as opposed to water because water conducts electricity, so it’s not good for the electronics; therefore we chose oil.

I built this model using Gaussian Process Regression, so it’s essentially a model where you are able to predict temperature in locations where you don’t have temperature sensors, so that we don’t need to have thousands of temperature sensors in this tub. We can use this model to say where we actually need sensors, so that we can put them in the locations where they are most needed. So that was my job at SKA. I decided to take it further and I did that in my Masters as well, which was very useful because I got to do a lot of literature reviews and got deeper into the research aspect of the project, not only the implementation part, so that’s also my Master’s Thesis.
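To make the idea concrete, here is a minimal sketch of Gaussian Process Regression used the way the answer above describes: predicting temperature where there is no sensor, and using the model's uncertainty to suggest where a sensor is most needed. It is an illustration only, not the thesis implementation. The sensor positions, temperatures, kernel choice (squared-exponential) and hyperparameter values are all assumptions made up for this example, and the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, signal_var=4.0):
    """Squared-exponential covariance between two sets of points (assumed kernel)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_predict(x_train, y_train, x_query, noise_var=1e-4):
    """Posterior mean and variance at x_query, using the mean reading as the prior mean."""
    y_mean = y_train.mean()
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)  # lower-triangular factor of the training covariance
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = k_qt @ alpha + y_mean
    v = np.linalg.solve(chol, k_qt.T)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v ** 2, axis=0)
    return mean, var

def suggest_next_sensor(x_train, y_train, candidates):
    """Pick the candidate location where the model is least certain."""
    _, var = gp_predict(x_train, y_train, candidates)
    return candidates[np.argmax(var)]

# Hypothetical layout: four sensors at the corners of a 1 m x 1 m tank.
sensor_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sensor_temp_c = np.array([40.0, 42.0, 41.0, 45.0])

# Candidate spots for an extra sensor: one right next to a sensor, one mid-tank.
candidates = np.array([[0.05, 0.0], [0.5, 0.5]])

mean, var = gp_predict(sensor_xy, sensor_temp_c, candidates)
print("predicted temperatures:", mean)
print("predictive variances:", var)
print("suggested sensor location:", suggest_next_sensor(sensor_xy, sensor_temp_c, candidates))
```

The predictive variance is what makes this approach useful for sensor placement: it is small close to existing sensors and grows with distance from them, so the location with the largest variance is a natural place for the next sensor.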

To get back to the Square Kilometre Array, what you’ve spoken about, keeping those electronics cool, it ties into the Big Data project that the Square Kilometre Array entails. This is probably the biggest Big Data project in the world. Can you give us an idea of how much data is going to be processed and why it is so essential to cool down those electronics for that data?

At this moment, unfortunately, I cannot give you the actual numbers of the data that we’re going to process, but I do know that it’s very important for us to actually cool the electronics down because, as you have said, the data that we are going to be processing in a day is going to be more than the data that is actually processed by the entire internet on a daily basis. That in itself means that we need to think about more than just how we’re going to process the data, because we actually have three challenges.

We have to find a way to process the data, we have to find a way to store the data, and we also have to find a way to actually cool the electronics without using a lot of power. That is why this project of immersion cooling came about because they said one of the ways in which we can reduce power consumption is by either thinking about storage and processing or cooling. What I was specifically working on was thinking about how we can use cooling to try and reduce the cost. With immersion cooling, I’ve actually found that it reduces power consumption by about 70%.

Instead of having big fans over these servers you can now use immersion cooling?

Yes, exactly. Because the thing is, the research shows that in a lot of data centres that use air-cooling, a lot of their power is consumed by the fans and not the actual electronics. Therefore, if you cut out all of those fans then you focus on the actual computer that is doing the computing and not the fans around it. Therefore, using immersion cooling actually gives us a chance to focus on what we’re really interested in, which is the actual computer and the actual computation.

So, that’s what was really interesting about immersion cooling, but obviously now there’s a downside to it. This is that it can be very messy because I actually implemented an immersion cooled environment, so I know that it’s quite a messy situation and also many people – for instance manufacturers for servers – are not very comfortable with you putting your electronics in oil. So, the moment you dunk it in oil, then you must know that there’s no warranty or guarantee or anything like that, it’s gone. However, with time, I think we’re also getting new manufacturers that are actually manufacturing new servers that are useful for immersion cooling specifically.

You’re dunking these electronics straight into oil. There’s a lot that must go into that.

Exactly and also, you need to think about the type of server that you need to use. For instance, you can’t use your hard drive; you cannot dunk a hard drive in oil. It’s just not going to work, so you have to change it to a solid-state drive if you want to dunk it in oil, but with SKA, what we’re working on with the Science Data Processing, is also coming up with a completely new board that we can use and that we will dunk in oil. So, that’s one of the projects that we’re also working on at the moment together with Intel.

Will that then be an Intel-powered board or would it be sort of a unique SKA product?

It will be an Intel board. There are many specifics to who’s going to own the programme, the actual project, but I know that it’s an SKA initiative. We came up with the board, particularly the Science Data Processor, came up with the board. But because obviously Intel is one of the biggest companies in electronics, we had to find out from them if they would be willing. I think it is something that is actually useful because one of the things that we look at SKA as well is innovation. That’s how this immersion cooling project also came about because we’re trying to find new ways to solve problems that have been there for such a long time.

Aphiwe, is this now a world-first technology that you’re developing here, or are you building on something that has already been out there?

No, immersion cooling is a concept that actually has been out there for quite a long time and I think Google has also started implementing immersion cooling. But it’s just that it’s something that has never been done in Africa or South Africa for that matter. Even overseas, the people who have done it, I only know of about three companies that have actually tried to attempt it. So, it’s still very new in the sense that many people, even the people that are implementing it, are still testing it out. It’s still very new, even globally.

It’s not something that many people are comfortable with even though you can say Intel is using it, but there are still many companies that need more proof to say, “Why do we need to use it”. It’s still very new in that sense, but the concept of immersion cooling has been there for such a long time, but it was put on hold, I think probably because people still didn’t feel very comfortable about dunking electronics in any form of liquid.

There would be a few risks with that as well then?

Yes. I completely understand where they are coming from, so you need to sort of stand in front of them and say, “This is why”. When I joined SKA I then went to UCT and I said, “I need a new supervisor and I’m going to work on this project”. I remember the first time I spoke with the HOD, he was really shocked and he said, “I have never heard of anyone dunking electronics in liquid and I cannot guarantee that you are going to get a supervisor in this project”.

He sort of said to me, “Maybe you need to go to mechanical engineering and see what they think” and I was like, “Yes, you might think it is not electronics, but actually the specifics of the project involve a lot of electronics, more than the mechanical aspect” and I had to do a presentation amongst the UCT supervisors to say to them: “This is why I think this will work”. And that’s how I got my supervisor. It shows that many people are still very uncomfortable about the concept.

What have you found so far? Is this something that can ultimately be used on the SKA project?

Yes, I definitely think that it’s something that can be used in the SKA project, but obviously they would need to think about it. One of the recommendations I gave is that we need to think about how we’re going to deal with the issue that it’s actually a bit messy. So I think in that sense that’s the only thing that we need to think about and I still flirt with a few ideas of how we can try and limit that situation. We can also look at other bigger data centres that have actually implemented something similar to this and how they deal with something like that. Without any doubt, it’s the way to go if we want to save power and costs as well, so I definitely think it would be a great way going forward.

When you say that it gets messy, how messy does it get? Do you have to, as an engineer, get in there and get your hands dirty?

If the system or one of the servers shuts down and it needs to be cleaned, someone has to physically go there and get their hands dirty in the oil and get the thing out. And if you get one drop of oil, before you know it the whole room is oily, so you need to be careful about it. So, from my experience, I think definitely one of the things that you really need to focus on is also: “How are we going to deal with this mess?”

Just so that we can paint a picture of how this would look. You would basically, I guess, drop individual server units into the oil itself? Would those then be lifted off the floor or how does that practically work?

The way that it works is exactly the way that you’re describing it. You’re literally dunking your switch, your server board in oil; literally, the whole thing is immersed in oil.

Wouldn’t this kind of research and development have many applications, potentially not just for the SKA, but for big tech companies as well? You think of Google with their massive data centres and IBM — all these other major tech companies out there. Are these other tech companies engaging with you or have they looked at what you’ve discovered so far?

Actually, no. I think it was probably because I was still working on the project and I didn’t really have concrete results. In projects like this, I think if you’re going to deal with tech companies, you need to come to them with facts, you know, and say, “This is what I’ve done”. There are many companies that have come to see us. For instance, we’ve had people from Intel, we’ve had people from the IEEE that have come, but at the time, it was still incomplete. We were still busy with the project. Now that I’m done with it, I assume that the way forward is for me to go to people and say, “Okay, this is what I’ve done, these are my results and I really would recommend that you should consider this”. I think that’s probably the next step going forward.

You’ve said that you’ve also just completed a thesis, can you tell us a little bit more about that?

Yes, actually my thesis was about the system that I implemented. I focused on immersion cooling as a more viable option to air-cooling and I focused on the environmental monitoring system that I built as well as the immersion cooling temperature predictor that I built. I also did many experiments in immersion cooling to try to see whether we can detect a hotspot in an immersion cooled environment and we define a hotspot as a spot that is stagnant. There are two types of immersion cooling. There’s one-phase immersion cooling and two-phase immersion cooling. With one-phase immersion cooling you make use of a pump, but with two-phase immersion cooling you don’t use a pump at all, you use a special liquid, which works by evaporation.

Through evaporation, that’s how the heat goes in and out, but with two-phase immersion cooling, the liquid itself is very expensive. We as SKA decided that we were not going to do two-phase immersion cooling for that reason. So, we focused on one-phase immersion cooling, which means that we’re going to make use of a pump. But now if you pump the liquid – because of the structure, because the servers themselves are placed vertically in the oil – it’s difficult for the oil to move around, so you might find a space where the oil is not moving at all. We call that a hotspot, and that is not going to be good for the electronics because if it gets too hot in that specific area then it’s going to cause a problem for the electronics. I did several experiments to find out if it’s possible to detect a hotspot in such an environment. My thesis also presents those experiments and, if one wants to do that, how they can do it.
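As a rough illustration of what detecting a hotspot could look like in software, the sketch below flags locations whose temperature sits well above the rest of the tank. This is not the method from the thesis, which the interview does not describe; the median-plus-margin rule, the 5 °C margin, the slot names and the readings are all assumptions invented for this example.

```python
from statistics import median

def find_hotspots(readings, margin_c=5.0):
    """Return locations whose temperature exceeds the tank median by more than margin_c.

    readings maps a location label to a temperature in degrees Celsius.
    The median-plus-margin rule is an assumed heuristic, not a published criterion.
    """
    baseline = median(readings.values())
    return sorted(loc for loc, temp in readings.items() if temp - baseline > margin_c)

# Hypothetical readings (measured or predicted) for five vertical server slots.
readings = {
    "slot-1": 41.2,
    "slot-2": 40.8,
    "slot-3": 49.5,  # oil assumed to be stagnant here, so heat builds up
    "slot-4": 41.0,
    "slot-5": 40.5,
}

print(find_hotspots(readings))
```

Using the median as the baseline means a single stagnant region does not drag the reference temperature upward, which a plain average would do.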

Now this is obviously your thesis for your Masters.

Yes, that’s my Master’s Thesis.

So you were a BSc student previously?

I did my BSc in Computer Science, then I did my Honours in Computer Science as well at the University of Fort Hare, and then I switched. At UCT, I did my Masters in Electrical Engineering, but focusing on Computer Engineering.

What prompted you to get into this field and landed you where you are today? It’s just incredible what you’re researching there.

To be honest with you, a lot of the time I say to people I don’t really think I have found my career; I think my career found me. Growing up, I wanted to be a medical doctor. I grew up in the villages in Queenstown and I didn’t have a maths and physical science teacher from grade 10 to 12. We would occasionally have temporary teachers and I really liked maths. So, when I finished matric, I had a C in mathematics, which I pretty much taught myself and then in physics I think I got a D or an E or something like that, but it wasn’t good enough for me to go into medicine. I don’t want to lie, that really brought me down. I was pretty much depressed for the first two months because I felt cheated by the system.

After I had worked so hard on my own, I still didn’t get to the point to where I got what I wanted. But in most cases I think God had a plan, and then after two months of being really depressed, I could see that it was weighing down on my parents because every day they would come home very early from school, from their work because they were worried that I would kill myself or something like that. I then just sat down with them and I said, “Okay, you know what, it’s fine. I will go to any institution just so that I can go to school” and that was their plan as well. They said, “No, we’ll pay for your fees, we’ll try our best to pay for your fees. Just go to any institution so that you can be amongst other students”.

So, that’s when we went to the University of Fort Hare and they said to me, “No, you’re late and because you are late we don’t have space for you in mainstream BSc, you’re going to have to go to foundation” and that just made the situation worse because I was now going to do four years as opposed to three years and it just got me back to my depression mode because I was like, “No, this is even worse”. But then I just decided, you know what, let me just go, I’ll just have to see what’s going to happen and at Fort Hare, they said, “You can do pre-medicine for two years and then you can apply at UNISA for medicine”. Then I was like, okay, that’s great. I went in to do that and that was in March.

I remember after a week I was in class, I was told that next week we’re having the test and I was behind obviously, so I had a lot of catching up to do, but at the end of the year, the nice thing about that programme, is that for the first year you actually get to do everything that they have in the department and that’s how I fell in love with computer science. I didn’t even know that computer science was a career. I used to see computers, but I didn’t know that there is a career in computer science or anything like that. So, by the end of the year when I had a meeting with my parents and they said to me, “Okay, you’re done with the first year”, I said to them, “I don’t want to do medicine anymore, I want to do computer science and I want to continue in computer science”.

Luckily, I had done so well to a point where at Fort Hare they have this thing of, if you get more than 75% in your modules then they pay you back the money that you paid, they call it a fee waiver. So, everything that my parents paid for was paid back to them that year and then the next year my math lecturer said to me, “You need to apply for this bursary” and I applied for the Industrial Development Corporation bursary and they’ve been funding me ever since. I often say, “I never really paid for my education or my parents never really paid for my education” because the IDC covered me up until I did my honours. Then when I was done with my honours and I was still at a place where I was like, “Okay, now what, what do I need to do, what am I going to do next year?” my computer science professor said they were accepting applications in YPDP, the Young Professionals Development Programme.

I applied for that and when I came to SKA, the first day was pretty much an introductory day where they took us around the office and that’s where I fell in love with computer engineering. Believe it or not, I didn’t know that computer engineering was a career, so I was really star struck that day to a point where I was like, “You know what, this is what I want to do. If these guys take me, this is what I’m going to do”. And luckily I was accepted and as soon as I got to SKA I said, “I want to start a career in computer engineering and see how that goes” and I’m happy to say that in December I’m going to be graduating as a computer engineer with my master’s degree. That’s why I say, I think my career found me, I didn’t find my career.

That’s a fascinating story. Looking at the whole debate around women in computer engineering, it’s not just a South African debate, but it’s a global debate that there are not enough women in the field or that there’s something wrong where women are seemingly not attracted to the field. What do you make of all of that? You’ve obviously punched through that and you’re doing some very interesting stuff.

That’s very interesting, Gareth. I have a very different experience to many people in that with time I’ve noticed that it’s not the males that actually look down on you because you’re a female in this environment. I don’t want to lie, all the people that I’ve worked with, even from university, at university when I was doing my honours in computer science, we were only four females, and the rest were males. The males never, not even once, looked down on me.

It was other females who were not even doing computer science or were not even doing computer engineering that would come to you and say, “How long do you think you’re going to last in this environment?” There’s this notion, especially in the Xhosa culture, I don’t know about other cultures, but a woman’s place is in the kitchen. At the end of the day, even if you study one of the cases for such a long time, you’re not woman enough if you don’t get married at the end of the day. Many women would come to me and say, “How do you think you’re going to support your husband or anything like that with such a demanding degree or with such a demanding career?”

Artist’s impression of the 15m x 12m Offset Gregorian Antennas within the central core of the Square Kilometre Array.

So, a lot of the times, for me it was always women that would try to make me feel like I’ve made the worst decision of my life by going into engineering or science for that matter and they were not even attempting to do it because I assumed that they felt that they were not going to be able to take care of their homes or something like that. One thing that I’ve received from males, is that many women who are in science or engineering are not wife material because they tend to be very uptight, too confident and too stubborn. So that’s the only thing that I got from the male side, but a lot of discrimination I got from my other females, not from the males.

Many people always focus on, you know, “We need to get our males to accept our females”. But I keep on saying to people, “No, it’s not like that. We need to get our females to accept their females and accept their power and that they can do it”. Coming into this environment, I’ve actually met many wonderful women who are in engineering, who have families, who have children that they feed. Yes, it’s not an easy process, but they do it and they are happy, which means that other people can do it as well. But this stereotype that it’s not possible and it comes from females unfortunately, not even from the males from my experience.

Aphiwe, looking forward, what does the future hold for you now that you’ve been through this whole process? And with some of the experience that you’ve now gained from the SKA, are you going to work with them in the future or are you going to possibly move onto something else in five or ten years’ time?

At this moment, actually I’m still at a point in my life where I’m putting all the cards on the table and trying to see where to from now because for the past three months I was in the States. I went to the National Radio Astronomy Observatory for an exchange programme where we did a lot of radio astronomy computation, we also worked on project management and I got my Project Management certification. We worked on systems engineering and leadership. So, that stuff opened doors for me on a global level because they were really interested in working with me in the future. So, that’s one of the things that’s also on the table, but I’m very much invested as well in South African education. Ever since I joined the SKA, they have had an outreach programme where employees are allowed to go out and do outreach.

Through this, I started a programme called IMBASA. Initially it started as a Science Data Processor team thing because I was chosen to lead the outreach team, so I started this programme which seeks to motivate students to do math and science from previously disadvantaged communities — communities that are similar to the community that I come from. Because even when I was growing up, I used to notice that many students don’t do maths and science and it’s because they feel like they’re not worthy or it’s not something that is for them. We do this by taking role models. We identify science role models; we call them “Science Celebrities”.

These are people that they can relate to, people that come from similar communities to them and these people will go to them and they motivate them to do math and science and they talk to them. But with time we noticed that this is not enough because what happens once the students are motivated? Then we chipped in with tutoring. We first motivate these students when they are doing grade eight and then when they start grade ten, we offer tutoring, math and science tutors. I managed to get employees from SKA to go out to the school to tutor students in math and science. They volunteer at their own will and I don’t pay for their transport or anything, they pay for their own transport to go to the school. That’s why it’s been going really great and then we also found out that was not enough because once these students go through the tutoring programme, there’s still something else.

When they get to grade 12, they don’t have financial support to go to school because many of the students from the previously disadvantaged communities depend on NSFAS and NSFAS only kicks in in March like the food allowance and meal allowance and stuff like that, so many students drop out of school during the first three months because you moved from home to start off with to go to this institution and your parents didn’t have money to support you, so how are you going to a new place where you know nobody and you have to find your way around and also you’re going around with an empty stomach. That’s too much to bear for a first year student.

Then I started this grant where employees from SKA donate funds and we give the students money for them to register at the beginning of the year and also food and transport for the first three months, so that at least they can get through their first three months at university, which I think are very critical for a first year student, and be at ease so that they can focus more on getting used to this new environment as opposed to worrying about food and transport and stuff like that. So now we’ve also opened that fund to the public, so anyone who is interested in donating to this grant can do so; it’s not only for the SKA.

I am interested in working in the States, but at the same time I’m very invested in trying to improve the education system in South Africa because I see a lot of potential in South Africa, particularly in previously disadvantaged communities because we sort of like find our way to at least get through matric and we actually pass matric. Then because companies are looking for straight A students, we’re obviously not going to be straight A students because we’ve been teaching ourselves. That’s what I said to my parents, “The day I would get someone who would teach me maths and science, I would get 100%” and that’s what I did at university.

There was a point when I got 100% in my tests and my maths teacher couldn’t believe it and I said, “It’s not because I can’t do it, it’s just that I need the necessary resources that I don’t have for me to do it”. But then there’s a whole pool of these people from previously disadvantaged communities that are just like me who have this potential but they don’t have the necessary resources, but no one ever gets to them because the system only stops and says, “We’re looking for top students and that’s it”, and it stops there. No one ever wants to go down and look deeper than that because the problem is actually deeper. We are there, and many of them are crying out to say, “Listen to us, we exist”, but automatically the system just cuts them out. That’s why I’m saying, this programme is particularly focused on students like that so that we can try and rescue the few students that we can rescue.

Can the public donate to the fund that you were talking about as well? Are there any details that they can follow up on or a website?

We’re working on setting up a website where people can donate. At the moment, we only have an account where people can deposit funds, but I’d advise anyone who’s interested to call the SKA number and ask for Aphiwe.

[Tel: +27 (0)21 506 7300]

Okay, we’ll definitely put that contact detail on our post as well. Aphiwe, it’s been an absolute pleasure talking to you, fascinating stuff, thanks a lot for taking the time to chat to us today on Biznews.

Thank you so much, Gareth, I really appreciate it.

Great.

It’s been great talking to you.
