SA Data Science whizz who wants to solve Africa’s ‘tough problems’ with AI – Prof Vukosi Marivate

Vukosi Marivate moved to the United States for six years, where he did a PhD in computer science and studied at Harvard University, but he decided to return to South Africa because he saw a lot of opportunity in South Africa and across Africa to resolve some of the continent’s challenges with machine learning. Marivate told BizNews that he wanted to benefit society and “change the environment and allow people to do good work” with AI. In trying to resolve some of these challenges, he co-founded a grassroots AI organisation called the Deep Learning Indaba, which holds annual conferences to bring the African AI community together, and a private research lab called Lelapa.ai to nurture a community of people working on AI on the African continent. Lelapa.ai had a seed round of $2.5 million and has big-name investors including Google’s AI chief, Jeff Dean. Marivate wears many hats. His full-time job is Chair of Data Science at the University of Pretoria, where he runs the Data Science for Social Impact Research Group. He helped to create the Covid-19 ZA dashboard, which is still the only South African source of aggregated data on Covid-19 in the country, and he has now been selected by the World Economic Forum as a Young Global Leader. – Linda van Tilburg


Working with machine learning but caring about the people using the tools you create

You want to work in AI, you want to work in machine learning, but you care about other people using the tools that you create. So, it’s not just for yourself. It’s not: oh, I built this thing, it’s very clever and it does very interesting things, but it’s very hard for somebody else to actually use it. So, through that experience, I set a goal. When I finish this PhD, I really want to be in a position where I could be leading and building up a career within data science. So, I finished the PhD, came back to South Africa and then rejoined the CSIR. I had been at the CSIR before and led some of their data science initiatives. I also enjoy data science because I’m a person who has different itches, and one is that I really like learning about the world through data. So, it’s not necessarily just for one area, but different areas, and with data science you’re always meeting new people who are trying to solve their problems, whether in business, in science or in society. Then you learn about what data they have and you think: what modelling could I do here, statistical machine learning or AI? So, it’s allowed me to keep that going and learn and work with a lot of different people. That’s one of the reasons I ended up at the business school at Harvard University. I got the Harvard South Africa Fellowship and chose to do the leadership programme there and learn about the business side. How do you do things like communicate value, estimate what that value might be for businesses and organisations, and show how data science comes into that?


Returning to South Africa and co-founding Deep Learning Indaba and Lelapa.ai

I was really bullish. I think there’s a lot of opportunity within South Africa and the African continent, and there are ways that we could change the environment and allow people to do a lot of good work within these spaces. In trying to resolve some of the challenges, we co-founded a grassroots AI organisation called Deep Learning Indaba in 2016. It has held large annual conferences since 2017 that bring the African AI community together. This allows people to actually feel like they’re part of something. You don’t feel alone, and you can then also identify and take on some of these wicked challenges. That community has led to many, many outputs for the continent, and people are also finding ways to stay and build wherever they are. In the same way, Lelapa.ai grew out of that, asking: if we’re now nurturing this community of people working on AI on the African continent, where do they go outside of the university system? You also have to increase how much R&D is done in the private sector or industry. With my co-founders, we came together and said let’s start a private research lab that will do R&D, build tools, release them, get people to use them and pay for using them. It creates a home that people can have on the continent instead of just leaving. Lelapa means home in Setswana.

Big-name tech investors including Google’s Jeff Dean and a $2.5 million seed round for Lelapa.ai

It was a very interesting journey. I’m an academic researcher, so I had to switch hats and ask: how do you fundraise? That was a journey of over a year and yes, part of it was looking for high-net-worth individuals, especially at the beginning, because you need to find people who could believe in our team of six co-founders. Four of them are leading academics within their fields: three are in South Africa, including me, and one is a South African based at Brown University in the US. The other two are great data scientists and machine learning engineers, the CEO Jade Abbott and COO Pelonomi Moiloa, who’ve built fantastic products before and have now come in and said, let’s do this ourselves. Then we had to go out there and shop this around to people and say, please believe in us so that we can get some runway to show that you can solve some tough problems using good AI research.


Building visibility for African languages in natural language processing in AI 

Jade, the CEO, and I come from building another grassroots organisation called Masakhane that looks at African languages and AI. We’ve been doing research in that space and building up visibility for African natural language processing. Can we build products, do research and spin off tools that could be used? Some of the questions are: How do you get multilingual tools or services that are available in different sectors? If you think about South Africa, it’s not just English, or sometimes English and Afrikaans. Can people interact with different sectors in their own language? Can they better understand concepts there and make it through automated systems? We are bringing together not just the technical engineers or the scientists. We work with linguists as part of our team so that we can build very good tools in isiZulu: tools that can recognise what language you are trying to speak and then switch interfaces so that you can use the language that you’re comfortable with.

Open sharing of datasets at the UP Data Science for Social Impact Research Group

We’re looking at building tooling for low-resource African languages. These languages might have millions and millions of speakers, but they don’t have a lot of digital resources. We might take it for granted that you can easily access an English dictionary on the Internet, but for many languages in South Africa, you don’t have digital dictionaries that you can get access to very easily. That’s just one example. The same thing is true for text, for speech information and all those things. So, what we do in the lab, and also in Lelapa and Masakhane, is figure out how we get data and how we improve the machine learning or AI approaches so that they work better even if you have low resources and can still give you worthwhile feedback. We’ve been doing that throughout the years. We’ve built new datasets at the University of Pretoria that we share openly. We’ve got new tools and software that we’ve made available for people to come and create other tools.

We’ve been very lucky to have grants from Meta and Google that assist us in building some of these things for African languages. They said: we’re interested in your work, here’s some money to keep your research going and see where it goes.

On the Lelapa side, the point is to say we can do this, but sometimes you need access to data that is private, which you cannot get in the same way inside the university. You need to work with partners who want you to see all the internals of their organisations and use those to better customise the tools. So, that becomes an outlet to spin out the R&D. It’s a different environment from what happens if you’re in a private company as well.


We need to increase spending on Research and Development (R&D) in Africa

One thing I believe is that we need to increase R&D spending across the African continent. I think South Africa has R&D spend of something like 0.7% of GDP. If you think about the U.S., it’s close to 3%, and we need to get there. That’s not necessarily going to come from the public sector or government; it’s likely going to come from the private sector. With the WEF Young Global Leader programme, one of the things I want to do is learn from my cohort and other YGLs. How do you generate excitement and real investments that are not necessarily for yourself? For me, it is about investments for the whole community across the continent so that it creates an ecosystem that becomes sustainable.

Creating the only aggregated source of Covid-19 data in South Africa 

When COVID happened, there was a lot of talk inside the lab. What should we do? How can we help? The thing we quickly identified in South Africa was that there was no place where you could centrally get aggregated COVID information. The minister was releasing a report every day, and municipalities or provinces would put up some infographic saying what their number was, but those images and those reports are not data. They are not a nice spreadsheet that somebody can get and analyse. What we ended up doing was leading a team of volunteers, I think over 70, who worked every day so that if the minister said something, they would find the official source where the minister said it, add it to the spreadsheets that we had and then cite it to say it comes from here, it comes from this province, it comes from this Premier and all those things, and add aggregate information on how many people had got COVID, how many had been cured, how many might have died. We also covered things like vaccine information and hospital resourcing, giving estimates of those. So it became this huge programme because it allowed the right scientists, the ones whose day-to-day work is modelling and epidemiological forecasting, to have a place where they could just click, download and do their work. That’s what the Covid-19 ZA project was about. One of the outlets that you could see publicly, just to show what was possible with the data we had, was a dashboard that you could go to, but the dashboard was not the goal. The goal was to have this data repository that you could get access to.
In some ways, I like to say that even right now, in 2023, the repository that we built as a team of volunteers is still the only place you can actually go and download South African data, because there’s still no official government source that’s open and machine-readable where you can get a daily playback of everything that was going on in an aggregated way.
