Reinforcement Learning course

Every year since 2020, we have offered an in-depth Reinforcement Learning course at the Department of Engineering of the University of Padova.

Prerequisites

Course prerequisites include knowledge of machine learning fundamentals (e.g., as covered in the 'Machine Learning' course). Knowledge of elements of probability and statistics, calculus, and optimization algorithms is also expected. Previous programming experience with Python is expected for project assignments.

Target skills and knowledge

A student successfully completing the course should be able to articulate the key aspects that differentiate Reinforcement Learning from other machine learning approaches. Given an application, the student should be able to (i) determine whether it can be adequately formulated as a Reinforcement Learning problem; (ii) formalize it as such; and (iii) identify the set of techniques best suited to solve the task, together with the software tools to implement the solution.
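As a toy illustration of points (ii) and (iii), the sketch below formalizes a small corridor problem as an MDP and solves it with tabular Q-learning. This is a hedged, minimal example in plain Python, not course material: the environment, state and action encoding, and hyperparameters are all illustrative choices.

```python
import random

# Toy MDP (illustrative): a 5-state corridor. The agent starts in state 0
# and receives reward +1 on reaching state 4. Actions: 0 = left, 1 = right.
N_STATES, GOAL = 5, 4
ACTIONS = (0, 1)

def step(state, action):
    """Transition function of the toy MDP: move left or right, clipped to the corridor."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    done = next_state == GOAL
    return next_state, reward, done

def q_learning(episodes=500, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning update: bootstrap from the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
# Greedy policy extracted from the learned Q-values, for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]
print(policy)
```

After training, the greedy policy moves right in every non-terminal state, i.e., the agent has learned to head for the goal. The same three steps (identify the decision problem, formalize states/actions/rewards, pick a solution method) carry over to the deep RL tools used in the course projects.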

More information

For more information, please check the course page at https://en.didattica.unipd.it/ under Year > Second cycle degree courses > School of Engineering > Control System Engineering > Reinforcement Learning. The course also has a Moodle page, available at https://stem.elearning.unipd.it, where you can also find previous years' materials. The suggested textbook for the course is Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto.

Thesis

We offer a wide variety of thesis topics for students who attended the course; other topics, mainly in the areas of Anomaly Detection and Predictive Maintenance, may also be available.
Regarding RL theses, students can choose to carry out their thesis at a company or at the university. In either case, a PhD student will be assigned to them as a reference point for suggestions or in case of major problems.
Furthermore, students can propose their own thesis topic of interest, though it must first be discussed with the professors.

Some thesis proposals are:

  • Application of Continual learning in multi-agent reinforcement learning scenarios
  • Application of Continual Learning techniques for robotics manipulation on the edge
  • The cost of learning dilemma

    In recent years, Artificial Intelligence (AI) has become an increasingly central actor in our lives, enabling technological solutions to adapt to the immediate needs of end users. The shift from a static to a continuously evolving technology does not come for free but presents a critical cost in terms of communication bandwidth, computational power, and other network resources. Hence, telecommunication networks face competition between traditional data flows and those supporting the training of AI algorithms. This leads to a new dilemma: how can we balance network resources between end users and learning agents?

    The scientific community has started to become aware of the issues associated with the communication and computational overhead due to AI. Despite this, most works still assume that learning and user data are exchanged on separate channels, ignoring the dependency between agent training and network conditions. To overcome this limitation, it is necessary to investigate how to implement AI solutions from a new perspective.

    This project aims to define new strategies for jointly optimizing network resources and learning algorithms, analyzing how improving AI training affects the resources available to target applications. The project consists of identifying significant use cases where the "cost of learning" dilemma arises, modeling the scenarios theoretically, and proposing new solutions for the described problem.

    For more information, contact Federico Mason at federico.mason@unipd.it
  • Edge Computing-based Automated Driving in Duckietown
    Duckietown is a simple platform for automated driving, in which small cars need to navigate a town with pedestrians, traffic lights and signs, and other vehicles. The cars use an onboard NVIDIA Jetson board to run the computer vision and decision-making algorithms, but there is a clear need for offloading this processing to a central node. In this case, the wireless channel plays a significant role, as the latency deadlines are very tight and the data to be transferred can be large.
    In this thesis project, you will work to evaluate and improve deep reinforcement learning-based driving algorithms in constrained communication scenarios in Duckietown. Aside from the standard simulator, a physical testbed is available in the lab, and working with it will be a major component of the project.
    PREREQUISITES: in order to work on this topic, you should have completed the "Neural networks" and "Reinforcement learning" courses.
    CONTACTS: federico.chiariotti@unipd.it
  • Security in Goal-Oriented Communications
    Goal-oriented or effective communication is a new paradigm that allows sensors to only transmit the data that is relevant to the receiver, e.g., by omitting redundant or irrelevant information. However, this poses significant security challenges: if the transmitter uses dynamic compression, adapting the length of packets to the required amount of data, an eavesdropper may glean some information about the system from the timing and length of packets, even if their content is protected through encryption. This project involves the design of a learning-based eavesdropping attack on an effective communication system, evaluating the performance of the attacker and the possible countermeasures that the transmitter can employ.
    PREREQUISITES: in order to work on this topic, you should have completed the "Neural networks" and "Reinforcement learning" courses.
    CONTACTS: federico.chiariotti@unipd.it
  • Deep Reinforcement Learning for sensor scheduling
    Wake-up radio is a technology that allows Internet of Things (IoT) nodes to respond to requests, which can be ID- or content-based (in the former case, the sensor will send its latest reading if its ID is in the request, while in the latter, it will transmit if the data matches the conditions specified in the request message). This type of system allows for interesting scheduling opportunities, particularly if the sensor measurements are correlated: by carefully crafting scheduling requests, the IoT gateway can save energy and improve its estimate of the state of the system.

    This project involves the design of a Deep Reinforcement Learning algorithm to achieve these two goals at the same time, using knowledge of the system state and statistics as well as the sensors' battery states and capabilities. The problem can also include other factors, such as selecting different objective functions over the system state.
    PREREQUISITES: in order to work on this topic, you should have completed the "Neural networks" and "Reinforcement learning" courses.
    CONTACTS: federico.chiariotti@unipd.it
  • Goal-Oriented Integrated Sensing and Communication
    The Integrated Sensing and Communication (ISAC) paradigm uses wireless communication to sense the environment, using multi-antenna systems to infer, e.g., the position, identity, and current activity of people in a room by learning the perturbations they cause in a wireless channel. This is a very promising approach for 6G, as it would allow the network to be aware of what happens in its surroundings, adapting to events in real-time without the need for external sensors and cameras. However, the communication burden from transmitting all the channel sensing data for processing and learning is significant. Effective communication is a smart compression technique that can allow nodes to abstract the relevant information in the data, transmitting only the features that are necessary for the task at hand, e.g., for identifying the person by their gait and physique. The thesis project involves analyzing ISAC data and exploiting deep learning techniques to perform semantic compression, reducing the communication footprint of the target application while maintaining the same performance.
    PREREQUISITES: in order to work on this topic, you should have completed the "Neural networks" and "Reinforcement learning" courses.
    CONTACTS: federico.chiariotti@unipd.it

In all cases, contact Prof. Susto at gianantonio.susto@unipd.it

Staff

Gian Antonio Susto, Professor

First part of the course & DeepRL

gianantonio.susto@phd.unipd.it

Ruggero Carli, Professor

Second part of the course & Final lecture

ruggero.carli@dei.unipd.it

Niccolò Turcato, PhD Student

Laboratories & Robotics-related final projects & PyTorch

niccolo.turcato@phd.unipd.it

Alberto Sinigaglia, PhD Student

Laboratories & Games-related final projects & Tensorflow

alberto.sinigaglia@phd.unipd.it