About PhD in Data Science
The field of data science is emerging as a critical discipline with high relevance to economic growth and development. This doctoral training program established by AIMS will provide emerging African scientists the opportunity to conduct research at the forefront of data science, and work towards a PhD degree within a high-quality training program in Africa, in cooperation with institutions internationally.
The program will focus on theoretical foundations of data science as well as applications of data science to improve the daily lives of Africans. It is built on the understanding that modern approaches in data science require a combination of expertise spanning the areas of mathematics, statistics, computer science, and the applied sciences.
AIMS will be offering up to seven fully-funded PhD positions in this prestigious new doctoral program. The recruited students will be based in Rwanda at AIMS Rwanda, or any of the other AIMS centers, in partnership with universities and research institutions across Africa and globally. The program aims to train future change-makers, who will have an impact across academia, industry, education, and government.
Candidates can choose from a list of proposed research topics, and AIMS will assist in building a supervision team around these topics. Alternatively, candidates can suggest their own research topics, together with a proposed supervision team. Selected students will start in Oct/Nov 2021.
- Master’s degree (completed by Sept 2021) in mathematics, statistics, computer science, engineering, physics or other relevant fields;
- Sufficient theoretical foundations evidenced by prior work (courses/thesis/other training);
- Qualification for pursuing research on the chosen topic, including relevant programming expertise;
- Research potential evidenced by academic performance and involvement in relevant academic activities;
- Motivation for pursuing a PhD by research in the suggested topic;
- Being an African national.
- 7 positions available
- Length of program: 3 years, with possible extension to a 4th year
- Fully funded (stipend, equipment, health insurance, relocation costs, conference attendance, direct cost to graduating institution such as tuition fees and registration fees)
- International supervision teams from well-known research institutions
- Research topics that push the boundaries of data science
- Program start: Oct 2021
Applications are now closed.
The selection process is competitive and is conducted in two phases.
In Phase 1, you will be asked to submit the following documents:
- Application Form
- CV (you can use your own format, but please make sure to cover at least all items mentioned in this template that apply to your case)
- Transcripts (please submit both your undergraduate and your master's-level transcripts; additional transcripts can be submitted if relevant)
As part of the application form, you will be able to:
- Select or suggest a research topic
- Name two academics (ideally senior researchers) who are willing to write a letter of support
- Indicate whether you are already admitted to a university or have other plans that you would like to combine with this program
Additionally, the application form will allow you to tell us more about your research interests, motivation for pursuing a PhD and future plans. Your answers to the following questions will be central to the selection process. We advise that you prepare your answers offline with care.
- What is your motivation to pursue a PhD? Here, you can also mention plans for your future career. (1500 characters)
- Which research directions are you most interested in and why? Justify why you are qualified to pursue research in this area. Here, you can also comment on your reason for choosing the research topics selected above (2500 characters).
By sharing these details, you help us better support your individual situation and work with you to design a PhD program that fits your circumstances.
Deadline to apply (Phase 1): 1 June 2021
A small number of shortlisted candidates will be invited to Phase 2 of the selection process. In Phase 2, supervision teams are formed and applicants discuss their research plans in more detail with potential supervisors. As part of Phase 2, applicants are asked to submit a detailed research proposal.
Remarks on submitting your application
- You will be asked to select one main research topic (or suggest your own); you may also indicate a second topic choice (or suggest your own).
- Make sure that you satisfy the required background and skills when selecting a research topic.
- Arrangements are flexible and we can work with selected candidates to adapt to individual circumstances.
- No reference letters are required in Phase 1 of the application process. Please notify your referees that you are submitting their names as part of this application, as they may be contacted for letters or additional information during the selection process.
- Make sure that your CV covers, at a minimum, all items provided in this template (if they apply to your situation).
Women applicants are encouraged to apply.
Applications are now being accepted.
Questions should be directed to firstname.lastname@example.org
The new Doctoral Training Program in Data Science (DTP-DS) is established by Quantum Leap Africa (QLA) at the African Institute for Mathematical Sciences (AIMS) Rwanda in collaboration with top researchers from across the globe. Here, you can find more information on the following:
- Enrollment & Graduation
- Research Topics
- Training Components
Enrollment & Graduation
Candidates who are accepted into the program will be enrolled in two institutions:
- One of the five AIMS Centers of Excellence (Rwanda, Ghana, Cameroon, Senegal, South Africa)
- A higher education institution (generally in Africa) partnering with AIMS
Candidates will need to satisfy the degree requirements for a PhD by research at their graduating institution, as well as the program requirements of the AIMS DTP-DS. The PhD degree will be conferred by the partnering institution upon successful completion; AIMS provides international co-supervision partnerships, funding, and additional training in research skills and transferable skills.
Candidates are guided by a supervision team of 2-4 supervisors, forming a partnership between AIMS and higher education institutions in Africa and internationally. The supervision team will be formed during Phase 2 of the application process in communication with shortlisted candidates, the DTP-DS management board, and potential supervisors. Candidates may suggest their own supervision team. Each supervision team should include at least one supervisor affiliated with AIMS and one supervisor affiliated with the graduating institution. These rules are flexible, and details can be discussed and adjusted on a case-by-case basis.
Supervisors who are proposing research topics to PhD candidates as part of this program come from institutions across the globe, for example:
- University of Rwanda, Rwanda
- University of Ghana, Ghana
- University of Stellenbosch, South Africa
- University of the Witwatersrand, South Africa
- University of Cape Town, South Africa
- University of Pretoria, South Africa
- The Nelson Mandela African Institution of Science and Technology, Tanzania
- University of Cheikh Anta Diop, Senegal
- University of Yaounde, Cameroon
- University of Ibadan, Nigeria
- Jet Propulsion Laboratory, NASA, United States
- Aalto University, Finland
- Lappeenranta-Lahti University of Technology, Finland
- University of Oxford, United Kingdom
- Lancaster University, United Kingdom
- University of Bonn, Germany
- University of Tübingen, Germany
- Dedan Kimathi University of Technology, Kenya
Research Topics
Applicants can select from a list of research topics suggested by leading researchers in their field. Alternatively, applicants are welcome to suggest their own research topics. Shortlisted candidates will be put in touch with potential advisors for discussions on more concrete research ideas in Phase 2 of the application process.
Training Components
All candidates are invited to participate in an intensive training school in the first year of the program, organized at AIMS Rwanda. Here, candidates will acquire skills relevant to their research and broaden their subject knowledge in data science through a small number of intensive core courses taught by top international researchers.
The program plans to provide continuous training opportunities virtually and/or in person. Additional training components may include (but are not limited to):
- Guided seminars and reading groups
- Participation in transferable skills courses (academic writing, presentation skills, research methodology)
- Group projects / mini dissertations
- Designing and delivering a mini-course (senior PhD students)
- 3 Minute Thesis Competition (senior PhD students)
- Tutoring in the AIMS structured master's program (senior PhD students)
PhD candidates are encouraged to pursue internships in industry or external institutions towards the end of their PhD in a field related to their research topic, depending on sufficient progress on their dissertation.
A) Learning General Riemannian Regression Curves on Shape Spaces
In geometry processing, spaces of curves or spaces of surfaces are often considered as Riemannian manifolds, where points in such an infinite-dimensional space represent single curves or surfaces. Analogously, in image processing, spaces of images can also be considered as shape manifolds. Given a suitable Riemannian metric, shortest paths realize distances between the endpoints, the geometric exponential map allows the extrapolation of animations, and parallel transport enables the transfer of details from one object to another object in the space.
A fundamental task in the processing of large data sets of time-dependent samples is to identify the characteristic dynamics represented by this database. On Euclidean spaces, linear, polynomial, or more general regression curves are possible representations. The question is how to translate general regression models to Riemannian manifolds and learn the dynamics present in large input data.
This type of dynamics can be formulated with evolution differential equations and higher-order covariant derivatives. Based on the notion of discrete geodesic paths, suitable time discretizations have to be derived and used in the formulation of the actual regression problem. This approach will then be used to learn regression curves on interesting shape manifolds.
Background & Skills:
Prerequisites: knowledge of Riemannian geometry, optimization, and numerical simulation.
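To make the geometric notions in topic A concrete, here is a minimal sketch on the simplest curved space, the unit sphere, rather than on a shape space. It is an illustration only and not part of the project description: the function names `exp_map`, `log_map`, and `geodesic` are hypothetical, and the closed-form formulas hold for the sphere only. A geodesic is the manifold analogue of a straight regression line; evaluating it for t > 1 corresponds to the extrapolation mentioned above.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def exp_map(p, v):
    """Exponential map on the unit sphere.

    Follows the geodesic that starts at point p with initial velocity v
    (v must be tangent to the sphere at p) for unit time.
    """
    t = norm(v)
    if t < 1e-12:
        return list(p)
    return [math.cos(t) * a + math.sin(t) * b / t for a, b in zip(p, v)]


def log_map(p, q):
    """Inverse of exp_map: tangent vector at p that points towards q.

    Its length equals the geodesic distance between p and q.
    Not defined for antipodal points (q == -p); this sketch returns
    the zero vector in that degenerate case.
    """
    c = max(-1.0, min(1.0, dot(p, q)))
    theta = math.acos(c)
    # Component of q orthogonal to p, i.e. its projection onto the tangent space.
    w = [b - c * a for a, b in zip(p, q)]
    n = norm(w)
    if n < 1e-12:
        return [0.0] * len(p)
    return [theta * x / n for x in w]


def geodesic(p, q, t):
    """Point at time t on the shortest path from p (t = 0) to q (t = 1)."""
    return exp_map(p, [t * x for x in log_map(p, q)])


if __name__ == "__main__":
    p, q = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
    print("distance:", norm(log_map(p, q)))
    print("midpoint:", geodesic(p, q, 0.5))
    print("extrapolated:", geodesic(p, q, 1.5))
```

On shape spaces, these maps generally have no closed form, which is why the topic refers to discrete geodesic paths and time discretization.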
B) Theory of Neural Machine Translation for low-resource languages
How can we build effective neural machine translation algorithms for languages for which there is very little training data? This problem has two aspects: (1) there may not be high-quality translation data for any given language (e.g. to English) and (2) for many languages that are primarily spoken, there may not even be significant amounts of monolingual vernacular data. This project will work to address one, or potentially both, of these challenges.
Background & Skills:
Experience in Python and machine learning would be very useful.
C) Geospatial and/or Longitudinal Statistical Methods for Population Health Research
Prof Peter Diggle and Dr Emanuele Giorgi are members of CHICAS (Centre for Health Informatics, Computation and Statistics) within the Lancaster Medical School. Their research interests are in the development of statistical methods and associated software for the analysis of longitudinal and geospatial data motivated by and applied to global population health research.
Within this general area, the primary focus of a particular PhD project could be any of the following:
- development and evaluation of novel statistical models and associated methods of inference;
- substantive population health research requiring innovative application of existing statistical models and methods of inference;
- novel methods and user interfaces for real-time health surveillance.
For further information about CHICAS, see https://chicas.lancaster-university.uk
Background & Skills:
1) Degree-level education in statistics
2) Motivation to work on population health applications
D) Accelerating multitask reinforcement learning with attention mechanisms
Reinforcement learning has recently been successful in behaviour learning in a variety of high-profile, complex tasks. Unfortunately, it is generally very sample inefficient, which limits its wider use in real-world problems. Attention mechanisms such as transformers have recently provided significant benefits in other classes of temporal domains. We propose to leverage recent advances in attention to investigate whether a learning agent can learn to focus only on relevant features of a problem, thus greatly accelerating learning. In addition, we will explore the opportunities this provides for learning invariant properties and objects in an environment: knowledge which can be exploited to solve new problem instances.
Background & Skills:
Required: multivariate calculus, linear algebra, optimisation, basic familiarity with concepts in machine learning, strong Python programming experience
Recommended: familiarity with reinforcement learning or attention mechanisms
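For readers unfamiliar with attention, the idea of "focusing only on relevant features" can be sketched with scaled dot-product attention, the building block used in transformers. This is a generic textbook illustration in plain Python, not the method proposed for the project; the function names and the toy vectors are assumptions chosen for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Each key is scored against the query; the softmax of the scaled scores
    gives weights that sum to one, and the output is the weighted
    average of the value vectors.
    """
    d = len(query)
    scores = [
        sum(qi * ki for qi, ki in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    output = [
        sum(w * v[j] for w, v in zip(weights, values))
        for j in range(len(values[0]))
    ]
    return output, weights


if __name__ == "__main__":
    # Toy example: the first key is strongly aligned with the query,
    # so almost all of the weight goes to the first value vector.
    out, w = attention([1.0, 0.0], [[10.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
    print("weights:", w)
    print("output:", out)
```

In a reinforcement learning setting, the keys and values would typically be derived from features of the agent's observation, so that the learned weights indicate which features the agent attends to.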
E) AI Augmented Computational Ophthalmology Assessment of Eye Health
The background of the eye (fundus) is the only place in the body where the microcirculation can be observed directly. Images obtained by non-invasive examination of the fundus provide a plethora of relevant clinical signs that can guide clinical decision-making for conditions ranging from retinal degeneration to cardiovascular and metabolic syndromes. In this context, wider access to reliable and interpretable analysis of such images is important for guiding prevention and treatment services for large populations in the sub-Saharan region.
This PhD project aims to develop scalable and interpretable AI-driven computational ophthalmology solutions within the constraints of clinical settings observed in low- to middle-income countries of the sub-Saharan region. In partnership with the German startup eye2you, we will use mobile fundus cameras with a specialized smartphone app to collect high-quality images of the eye fundus in tertiary care clinical settings of the College of Medicine of the University of Ibadan, Nigeria. These will underpin the development of medical decision support tools for detecting back-of-the-eye signs that are associated with communicable and non-communicable diseases affecting the local community. Based on this first African fundus image data set, we will attempt to solve a range of open AI problems involved in making such a system work in practice: a) transfer learning for interpretable deep learning models; b) detection and correction of data set shifts, to be able to use US/EU cohorts for training the system; c) weak labelling strategies; and d) model deployment in mobile computing architectures.
Background & Skills:
Deep learning frameworks, Python programming, Android app development, statistics, computer vision, interest in medical applications and in working with clinical personnel, interest in data acquisition.
F) Uncertainty quantification and inverse problems for epidemiology and satellite remote sensing
Spatiotemporal Gaussian and non-Gaussian random field models can be used for studying satellite remote sensing data. Our focus is on state and parameter estimation algorithms. We model random fields via stochastic partial differential equations driven by Gaussian or non-Gaussian noise. This also allows a state-space representation, and thus the deployment of Bayesian filtering and smoothing techniques. In linear Gaussian cases, we can resort to Kalman filtering; for more complex models, we use advanced methods, including extended and ensemble Kalman filters, sequential Monte Carlo (SMC), and SMC^2.
As applications, we use satellite data (Landsat-8, Sentinel-5, and the current and future hyperspectral imagers EnMAP, PRISMA, CHIME, and AVIRIS-NG) for various targets, including plants. Modelling and remote sensing can cover plant diseases that are critical to African food security. Satellite images also allow online monitoring of changing population densities, which is crucial for epidemiological modelling. The specific target will be agreed with the doctoral candidate, but some possibilities are: forestation/degradation and vegetation remote monitoring with satellite data (in preparation for the forthcoming NASA mission on Surface Biology and Geology, which will observe a wide range of spectral bands); aerosol and atmospheric pollutant distribution analysis; and monitoring of invasive species, such as water hyacinth, along ocean coasts and in lakes.
Background & Skills:
Bayesian statistics, inverse problems, applied mathematics, computational methods
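The linear Gaussian case mentioned in topic F can be illustrated with a scalar Kalman filter. The following is a minimal sketch under stated assumptions, not the project's method: the model is a one-dimensional state x_k = a * x_{k-1} + w_k with noise variance q, observed as y_k = h * x_k + e_k with noise variance r, and the function name `kalman_filter_1d` and all parameter names and values are hypothetical.

```python
def kalman_filter_1d(observations, a, q, h, r, m0, p0):
    """Scalar Kalman filter.

    Model (an assumption for this sketch):
        x_k = a * x_{k-1} + w_k,  w_k ~ N(0, q)   (state dynamics)
        y_k = h * x_k + e_k,      e_k ~ N(0, r)   (measurement)
    m0 and p0 are the prior mean and variance of the state.
    Returns the filtered means and variances after each observation.
    """
    m, p = m0, p0
    means, variances = [], []
    for y in observations:
        # Prediction step: propagate mean and variance through the dynamics.
        m_pred = a * m
        p_pred = a * a * p + q
        # Update step: correct the prediction using the new observation.
        s = h * h * p_pred + r          # innovation variance
        k = p_pred * h / s              # Kalman gain
        m = m_pred + k * (y - h * m_pred)
        p = (1.0 - k * h) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances


if __name__ == "__main__":
    # Estimating a constant from noisy measurements (a = 1, q = 0).
    ys = [2.0, 2.0, 1.5, 2.5]
    means, variances = kalman_filter_1d(ys, a=1.0, q=0.0, h=1.0, r=1.0, m0=0.0, p0=1.0)
    for y, m, p in zip(ys, means, variances):
        print(f"y={y:.2f}  mean={m:.4f}  variance={p:.4f}")
```

The spatiotemporal models described above have high-dimensional states, which is one reason the topic also lists ensemble Kalman filters and sequential Monte Carlo methods.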
G) Unravelling RNN or Transformer Interpretability for Behavioural understanding
We are interested in finding ways to better understand the internal workings of deep learning models in the application area of natural language processing. Specifically, we are interested in how we can understand behavioural factors (how people write) in application domains such as health, cybersecurity, and education.
Background & Skills:
Natural Language Processing, Deep Learning, Interpretable Machine Learning
H) Leveraging machine learning to improve satellite rainfall estimates for African rainfed agriculture
This topic builds on a PhD topic in the AIMS Climate Sciences doctoral training programme, which uses statistical methods to validate satellite estimates. The current methods for estimating rainfall from satellite data focus on the relationship between cloud top temperature and rainfall occurrence. In the course of this work, we have become convinced that there is potential to use machine learning on the raw satellite data to identify rainfall event types, which may be identifiable as meteorological phenomena. This approach could enable the identification of extreme events, which is particularly important for specific applications.
Background & Skills:
Familiarity with working with complex data sources, including experience with gridded data.
I) Precision mapping of vulnerable ecosystems using sensor and satellite data
This project aims to leverage satellite and sensor data to map vulnerable ecosystems. The data sets envisioned include camera trap data, weather data, and other novel data sets collected from different ecosystems.
Background & Skills: An interest in ecology
J) Deep Transfer Learning
The project will attempt to push the boundaries of what is theoretically understood about deep transfer learning. Deep transfer learning has almost become the norm for how we train deep learning models nowadays. However, advances in the understanding of its theoretical foundations are quite limited in comparison. The work will build on known results using metrics involving distributions of input data, but will also explore techniques from other domains of mathematics and statistics. Simulations will be used where possible to support theoretical findings.
Background & Skills: Strong BSc in mathematics and/or statistics, programming skills
Chief Scientific Officer
AIMS – Secretariat
German Research Chair in Data Science
AIMS South Africa, Stellenbosch University
Bonn Junior Fellow, AIMS-Carnegie Research Chair in Data Science
University of Bonn
QLA – AIMS
Isambi Sailon Mbalawata
Research and Scientific Development Manager
AIMS – Secretariat
IBM Research Kenya
Nelson Mandela African Institution of Science and Technology (NM-AIST)
University College London
University of Essex
Google AI Ghana
University of Pretoria
University of Illinois at Urbana-Champaign