PhD Call: Large-Scale Learning over Distributed Streaming Data

Research Fields: Machine & Statistical Learning, Distributed Computing, Streaming Data, Internet of Things, Big Data Intelligence

Description: We are looking for an excellent candidate to pursue a PhD on the development of new large-scale machine learning methods for distributed, streaming contextual data generated in the Internet of Things (IoT). Such methods will become the basis for building intelligent applications over IoT streaming data. The IoT is part of the future Internet and comprises many billions of devices (‘things’) that sense, compute, communicate, share knowledge, and actuate. Such devices incorporate machine intelligence, physical/virtual identities, contextual sensors, RFIDs, social media, etc. The vision of the IoT is to allow ‘things’ to be connected any-time, any-place, with anything and anyone. Emerging Big Data applications based on knowledge derived from streaming contextual information include emergency situation awareness, smart city applications, remote sensing and environmental monitoring.

Challenges: Learning from contextual data and adapting to changes in the IoT poses a number of challenges that stem from the nature of the contextual data and the processes that generate them. Contextual information coming from IoT devices has a strong spatio-temporal dimension, which needs to be taken into account during data modelling and learning in order to achieve reliable knowledge inference/reasoning and context awareness. Further research challenges relate to in-network contextual data and knowledge fusion, and to localized decision making, which deals with the redundancies and interactions that exist among the distributed contextual data sources. In addition, IoT devices regularly fail, e.g. due to limited battery lifetime or loss of connectivity, resulting in incomplete contextual data. It is challenging for learning models and for context prediction and adaptation algorithms to cope with incomplete and missing contextual information, concept drift and changing data distributions.

Enrolment & Opportunity

The successful candidate will enrol as a PhD student at the School of Computing Science, University of Glasgow, under the supervision of Dr Christos Anagnostopoulos and will join the Information, Data, Event, and Analytics at Scale (IDEAS) research team of the University of Glasgow, led by Professor Peter Triantafillou. Our research team explores topics such as machine and statistical learning in high-dimensional settings; dimensionality reduction; scalable, distributed machine learning techniques for Big Data systems; and complex analytics query processing and optimization (temporal analytics, approximate query answers), with applications to urban data, smart cities, and polyomics data. For a more detailed description the interested candidates may visit: and the list of publications within there.

The University of Glasgow is a world-renowned education and research hub, offering considerable opportunities for training and exposure to machine learning, large-scale analytics, and distributed computing, with a number of research teams in the School of Computing Science active in these and related fields. In addition, the selected candidate will have ample opportunities to participate in the top conferences on distributed computing, large-scale learning and mining, and data engineering.


The ideal candidate will have a background in computer science and some background in either mathematics or statistics. Special areas of interest include statistical machine learning, basic statistics, and/or mathematical modelling/optimization. A good understanding of basic machine learning methods/algorithms, as well as an MSc in one of the above areas, will be a considerable plus. Programming skills, a good command of English and the ability to work in a team are required.

Further Information

Questions regarding academic and research aspects of the position should be directed to Dr Christos Anagnostopoulos by e-mail:

For general enquiries about the application process visit our How to Apply page.


Anagnostopoulos, C., and Triantafillou, P. (2015) Learning set cardinality in distance nearest neighbours. In: IEEE International Conference on Data Mining (IEEE ICDM 2015), Atlantic City, NJ, USA, 14-17 Nov 2015.

Anagnostopoulos, C., and Hadjiefthymiades, S. (2014) Advanced principal component-based compression schemes for wireless sensor networks. ACM Transactions on Sensor Networks, 11(1), 7.

Anagnostopoulos, C., and Triantafillou, P. (2014) Scaling out Big Data Missing Value Imputations: Pythia vs. Godzilla. In: 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’14), New York, N.Y., U.S.A, 24-27 Aug 2014, pp. 651-660.

Anagnostopoulos, C. (2014) Time-optimized contextual information forwarding in mobile sensor networks. Journal of Parallel and Distributed Computing, 74(5), pp. 2317-2332.
