The Research Data Management Development Fund was first established in 2022 by the Committee on Research Data Management of the University’s Research Committee. It aims to encourage CUHK researchers and students to embed research data management (RDM) practices throughout the entire research life cycle, improve data management processes, and promote the use of research data facilities and services at CUHK. All full-time CUHK staff members on professoriate or research academic ranks (i.e. from “Research Assistant Professor” to “Professor”) were invited to apply for the funding as principal investigator.
Following the Fund offered in 2022, the Research Data Management Fund supported 18 projects in 2023, with a total of HKD 1,797,835 awarded. Details of the projects are as follows:
Faculty of Arts
Faculty of Business Administration
Faculty of Education
Faculty of Engineering
Faculty of Law
Faculty of Medicine
Faculty of Science
Faculty of Social Science
Faculty of Arts |
Title |
Corpus and Artificial Intelligence: A Step Toward Automated Essay Scoring |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
HWANG Haerim |
Department of English |
Abstract |
The proposed project aims to establish a publicly available database of learner essays for use in Automatic Essay Scoring (AES), and to develop and assess diverse AES systems. To this end, the project endeavors to disseminate good research data management practices through a step-by-step approach: we create an English learner essay corpus incorporating learner-related information and then annotate each essay with scores provided by human raters. Using this corpus, the project develops various AES systems in Python based on different machine learning techniques and large language models, such as GPT-4, and then assesses them for accuracy. Upon completion of the project, the corpus data, Python code, and publications will be made accessible via online repositories. Sharing these deliverables is expected to facilitate the reproducibility and sustainability of the project, and enable the ready application of AES to new testing scenarios. While the PI assumes full responsibility for all project processes, undergraduate students, student helpers, and graduate research assistants will participate in different stages of the project to integrate research data management practices into their research life cycle. Through diverse activities, such as conference presentations and sharing workshops, this project will make significant contributions to advancing AES and data management practices. |
Start Date |
1-Feb-2024 |
End Date |
30-Jan-2026 |
Faculty of Arts |
Title |
Language Development of Bilingual and Trilingual Children in Hong Kong |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
YIP Choy Yin Virginia |
Department of Linguistics and Modern Languages |
Co-investigator |
MAI Ziyin |
Department of Linguistics and Modern Languages |
Co-investigator |
ZHOU Jiangling |
Department of Linguistics and Modern Languages |
Abstract |
This project aims to build a robust database of longitudinal speech data documenting the development of Hong Kong bilingual and trilingual children exposed to Cantonese, Mandarin and English in early childhood. The developmental data covers the preschool years from the onset of first words to production of complex grammar.
The project will benefit from a clear research data management plan. We aim to enhance the existing transcripts and improve the accuracy of the tagging of parts of speech of the child and adult utterances. Improvement of accuracy in tagging will greatly enhance the overall quality of the corpus.
The data will allow us to determine the developmental milestones for the children under investigation at different developmental stages, address important theoretical and empirical research questions, and assess a number of factors that contribute to bilingual and trilingual development, including the role of parental input, language dominance, degrees of balance in the target languages, and crosslinguistic influence. |
Start Date |
1-Feb-2024 |
End Date |
31-Jan-2026 |
Faculty of Business Administration |
Title |
Financial Reporting Quality and International Trade |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
ZHOU Yuqing |
School of Accountancy |
Abstract |
Globalization has advanced at an unprecedented pace in recent decades. International trade, in particular, has swelled. Understanding the impetus for this growth is critical for the world economy. This project highlights one potential factor, financial reporting quality, and examines whether its improvement facilitates international trade. An investigation of this question can illuminate ways to address frictions, such as information asymmetry, that inhibit trade, can advance understanding of the economic consequences of improved transparency, and can provide a link between financial reporting quality, corporate sector transparency, and economic growth.
Specifically, I will use Chinese and US export and import data to investigate how firms’ financial reporting quality affects their export and import value and the number of trading partners. First, I will use survey data from executives to measure accounting quality and conduct country-sector-level analyses. Second, I will use firm-level international trade data to conduct more detailed analyses.
International trade is crucial for the global economy. The evidence provided in this project extends the understanding of the real economic effects of high-quality financial disclosure. Moreover, there are important policy implications for developing countries that rely heavily on international trade for economic growth and at the same time have low financial reporting quality. |
Start Date |
1-Jan-2024 |
End Date |
31-Dec-2025 |
Faculty of Business Administration |
Title |
Mapping the Footprint of Crimes in China |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
KO Chiu Yu |
Department of Decisions, Operations, and Technology |
Abstract |
The confidentiality of crime records often hinders social research, making it difficult to gain a comprehensive understanding of the crime landscape. In this project, we make the first attempt to introduce a standardized and robust methodology for geocoding the criminal records extracted from China’s court judgments, a problem that has not been studied before. By leveraging advanced geocoding techniques, we can accurately identify the precise address of each crime case, thus facilitating extensive future research.
Through this innovative methodology, we aim to establish the first crime map in China that provides an accurate footprint for each individual crime instance. This proves invaluable for scholars, as it provides deeper insights into the spatial distribution of crimes, instead of just looking at the number of incidents. By analyzing the geographical patterns and clusters of criminal activities, we can uncover concealed trends and correlations that illuminate the dynamics of criminal behavior.
Through the creation of such a crime map, we aim to equip policymakers, researchers, and communities with the knowledge necessary to develop effective crime prevention strategies and enhance urban safety in China. |
Start Date |
1-Feb-2024 |
End Date |
31-Jul-2025 |
Faculty of Education |
Title |
Educational Opportunities and Social Mobility in Greater Bay Area (GBA): A Comparative Analysis in Hong Kong and Guangdong |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
OU Dongshu |
Department of Educational Administration and Policy |
Co-investigator |
WONG Kenneth K. |
Education Department, Brown University |
Abstract |
We continue our current research on educational opportunities and social mobility in the Greater Bay Area (GBA), including Hong Kong and Guangdong province. Using various data sources, such as Hong Kong Census data and China Family Panel Studies, we analyze three topics: (1) educational development and wellbeing of children, (2) intergenerational mobility, and (3) determinants of economic and educational disparity across the GBA. Our study provides an updated picture of the trend in social inequality in Hong Kong and Guangdong province as well as a comparison of the two education systems and labor markets. Research results have implications for the utilization and reward of human capital in the GBA labor market and for understanding potential barriers that create social inequality in the two societies. |
Start Date |
1-Jan-2024 |
End Date |
31-Dec-2025 |
Faculty of Engineering |
Title |
Research Data Management for Audio Generation |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
KONG Qiuqiang |
Department of Electronic Engineering |
Abstract |
Audio and music understanding is an important research topic in audio-based artificial intelligence (AI). Previous research has lacked high-quality data, such as well-labelled music data with genres, beats, and structures. Recently, high-quality data has become essential for training large-scale neural networks. High-quality datasets are also beneficial to the research community. We propose to collect and process high-quality data from the internet. We will design machine learning-based algorithms to automatically predict the quality of data. We also plan to augment the labels with natural language captions. Then, we will hire research assistants, students, and music experts to produce high-quality labels for the audio clips. We aim to collect 100 hours of data containing rich information on sound events and the pitches of audio events. The collected sound events will cover 200 types of sound. We will collect the metadata of the dataset and hire professionals to subjectively verify the effectiveness and quality of the datasets. All datasets will be stored both locally and online. We will release the datasets to the public after collection. |
Start Date |
1-Jan-2024 |
End Date |
31-Dec-2025 |
Faculty of Law |
Title |
Sentencing Database for Hong Kong Court Cases |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
CHENG Kevin Kwok Yin |
Faculty of Law |
Abstract |
The objective of this project is to embed research data management (RDM) practices throughout the research life cycle of the PI’s existing General Research Fund (GRF) project. The GRF project builds an original dataset through coding the Hong Kong Judiciary’s ‘Reasons for Sentence.’ The dataset captures sentencing factors, namely aggravating and mitigating factors, and sentence decisions across the three offences of drug trafficking, assault, and burglary. The RDM aspect provides hands-on experience in planning the data management, data acquisition, and data analysis of this dataset. The dataset will be securely stored and backed up as necessary. The dataset will be uploaded to the CUHK Research Data Repository to ensure data preservation, sharing, and reuse for potential future projects. The valuable experience gained through RDM practices will be shared with colleagues and students. |
Start Date |
1-Jan-2024 |
End Date |
31-Dec-2025 |
Faculty of Medicine |
Title |
Collection and Management of the Data from a Representative Sample of Literatures for Research on Evidence-based Medicine |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
YANG Zuyao |
The Jockey Club School of Public Health and Primary Care |
Co-investigator |
SHA Feng |
Center for Biomedical Information Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences |
Abstract |
Research evidence on the effects of interventions such as drug treatments, vaccination, and physical exercise can be used to inform the prevention and treatment of diseases and improve population health. Meta-analysis of randomized controlled trials is widely recognized as the best evidence to judge whether a particular intervention is effective. A random, representative sample of up-to-date meta-analysis papers and the data extracted from such papers can be used to investigate many important issues in evidence-based medicine. The PI of this project is experienced in doing meta-analyses to evaluate the effects of various interventions and has collaborated with the Co-I before. In this project, they will collect and manage the data from a random, representative sample (n=4,500) of up-to-date meta-analysis papers that can be used to answer multiple questions in evidence-based medicine as well as in the teaching of related courses. Around 300 eligible papers will be identified. From each paper, 21 data items will be extracted, containing no confidential or sensitive information. A detailed Data Management Plan has been created via DMPTool (ID: https://doi.org/10.48321/D1BH3C). Expected deliverables and outcomes include four datasets that can be shared with other researchers, one manuscript for publication, two Master’s students trained in methodological and data management skills, and a set of materials that can be used in future teaching. |
Start Date |
1-Jan-2024 |
End Date |
31-Dec-2024 |
Faculty of Medicine |
Title |
Determinants of Healthy Ageing – MrOS and MsOS (Hong Kong) Cohort Study |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
KWOK Timothy Chi Yui |
Department of Medicine and Therapeutics |
Co-investigator |
LU Zhihui |
Department of Medicine and Therapeutics |
Co-investigator |
LEUNG Jason |
The Jockey Club School of Public Health and Primary Care |
Abstract |
The proposed project plans to conduct a year-20 follow-up study of the older people from the MrOS and MsOS (Hong Kong) cohort study. The MrOS and MsOS (Hong Kong) cohort study was the first-ever cohort study in Asia to examine the determinants of osteoporotic fractures in older Chinese men and women. A variety of health-related data (over 7,500 variables) has been collected from 2,000 men and 2,000 women aged 65 years or above since the baseline in 2001. The cohort has been followed up for 20 years and has contributed to more than 200 publications. The project aims to ascertain the factors or biomarkers of healthy ageing in community-dwelling older people.
A systematic and self-explanatory database will be established and the data will be deposited at the CUHK Research Data Repository (https://researchdata.cuhk.edu.hk). The database will be accessible to approved researchers and students undertaking research into a wide range of age-related topics and will contribute to improving our understanding of healthy ageing. We are open to various collaborations. Further information on this cohort and the database can be found on our website (http://www.jococ.org/zh-hk/mros-msos.php). |
Start Date |
1-Jan-2024 |
End Date |
30-Jun-2025 |
Faculty of Medicine |
Title |
Factors Affecting Specialist Outpatient Involvement in Medical Decision-Making |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
WONG Eliza Lai-Yi |
The Jockey Club School of Public Health and Primary Care |
Co-investigator |
TIAN Yue |
The Jockey Club School of Public Health and Primary Care |
Abstract |
Background: Actively engaging patients in shared decision-making (SDM) is recognized as an essential approach to achieving patient-centered care and improving health outcomes. However, the implementation of SDM in routine medical care remains problematic.
Research Objectives: This study aims to investigate what factors may affect specialist outpatient participation in medical decision-making in Hong Kong.
Methods: This study includes two phases. In Phase I, in-depth interviews (n = 15) will be conducted to explore patients’ experiences and their perceived barriers and facilitators of SDM in specialist outpatient settings. In Phase II, a cross-sectional survey (n = 600) will be conducted among specialist outpatient attendees. Informed by the insights from Phase I and the literature, the survey will gather quantitative data on patient demographics, preferred and actual involvement in medical decision-making, and factors influencing patient involvement in SDM. Additionally, the questionnaire will incorporate a subset of open-ended questions to investigate factors affecting patients’ experiences related to SDM.
Conclusion: This will be the first study combining qualitative and quantitative methods to explore factors influencing patient involvement in decision-making in Hong Kong. The outcomes of this research can inform healthcare practices, enhance patient-centered care, and contribute to the implementation of SDM in specialist outpatient settings. |
Start Date |
1-Mar-2024 |
End Date |
1-Mar-2025 |
Faculty of Medicine |
Title |
Development and Validation of a Standardized Questionnaire in Health Care Utilization in Community-living Adults with Chronic Conditions in Hong Kong |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
HUO Zhaohua |
Department of Psychiatry |
Co-investigator |
LAM Linda C.W. |
Department of Psychiatry |
Co-investigator |
YIP Benjamin H.K. |
The Jockey Club School of Public Health and Primary Care |
Co-investigator |
LEE Allen Ting-Chun |
Department of Psychiatry |
Abstract |
With growing interest in evaluating the economic and social impacts of diseases and interventions, there is great demand for measuring people’s service utilization. To date, standardized and validated instruments are still very limited, and they are difficult to apply in different contexts. Our study aims to develop and validate a standardized health care utilization questionnaire (HUQ) for community-living adults with chronic diseases in Hong Kong. A literature review and expert interviews will first be conducted to develop a protocol for the HUQ. For validation, the HUQ will be tested on an estimated 150 older adults aged ≥60 who have at least one chronic disease. During a three-month period, participants are required to complete a cost diary at fixed intervals. They are also required to complete the HUQ at baseline and at the end of the study. The administration time and cost, test-retest reliability and validity of the HUQ will be analyzed. All data collected in this study will conform to the Guidelines on Research Data Management and be open to the community. We expect this instrument to be acceptable, reliable and valid, and to be widely usable for collecting service utilization data in different studies in the local context of Hong Kong. |
Start Date |
1-Mar-2024 |
End Date |
30-May-2025 |
Faculty of Medicine |
Title |
Multiscale Neuroimaging-based Data Integration for Mapping the Radiomics of Human Brain |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
LU Hanna |
Department of Psychiatry |
Co-investigator |
ZHANG Li |
Department of Mechanical and Automation Engineering |
Co-investigator |
LI Yu |
Department of Computer Science and Engineering |
Abstract |
Promoting the life cycle of big data through data-sharing initiatives is becoming standard practice in the neuroscience field. Multimodal neuroimaging data provide a powerful window into the complex structures of the human brain at multiple scales. Recent conceptual and methodological advances enable investigation of the interplay between the large-scale spatial trends in brain macrostructure, mesostructure and connectivity, offering an integrative framework to study multiscale brain organization. In particular, radiomics, a method that extracts a large number of features from medical images, can be combined with deep learning algorithms to enable high-throughput conversion of images into mineable data.
Here, we share three processed and validated magnetic resonance imaging (MRI) datasets acquired from the Cambridge Centre for Ageing and Neuroscience (Cam-CAN), the Harvard Aging Brain Study (HABS) and the Hong Kong SuperAgers study (ClinicalTrials.gov ID: NCT05728801), including high-resolution T1-weighted structural MRI and resting-state functional MRI (rs-fMRI) at 3 Tesla. The data were acquired from 643 healthy English participants, 220 American participants and 488 older Chinese adults. Alongside these, we share large-scale gradients estimated from each modality and the constructed radiomic models. Our processed datasets will facilitate future research examining the coupling between brain macrostructure, mesostructure, connectivity, and cognitive function. |
Start Date |
1-Oct-2024 |
End Date |
30-Sep-2025 |
Faculty of Science |
Title |
Biodiversity Genomic, Transcriptomic and Sanger Sequencing Data Management: Common Practices and Workshops |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
HUI Jerome Ho-lam |
School of Life Sciences |
Co-investigator |
SO Wai Lok |
School of Life Sciences |
Co-investigator |
YIP Ho Yin |
School of Life Sciences |
Co-investigator |
NONG Wenyan |
School of Life Sciences |
Abstract |
With the advancement of sequencing technologies, next-generation and third-generation sequencing have become more affordable to individual groups. Different fields are now generating genomic and transcriptomic datasets at an unprecedented speed. Together with conventional Sanger sequencing data, all these raw and/or processed data represent an important reusable resource serving various functions.
In this proposed 6-month project, we aim to set up common practice guidelines for storing these sequencing data in the CUHK Research Data Repository. We will first invite a group of students and researchers to input different kinds of data to the system, and through discussions and interviews with the participants, we will formulate a protocol based on their feedback. One workshop will then be carried out in the School of Life Sciences to train students and researchers, and necessary modifications to the protocol will be made if deemed appropriate. Another workshop will be offered to everyone in the University, and the established protocol will be made publicly available at the CUHK Research Data Repository website. |
Start Date |
1-Jan-2024 |
End Date |
30-Jun-2024 |
Faculty of Science |
Title |
A Pilot (Cryo-)EM- and Live-cell Imaging Database for Plant Organelle Research to Promote Good Practice of RDM at SLS, CUHK and Beyond |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
LIANG Zizhen |
School of Life Sciences |
Co-investigator |
ZHAI Liting |
School of Life Sciences |
Abstract |
We established the RGC-AoE “Centre for Organelle Biogenesis and Function” in 2014 at SLS-CUHK to promote collaborative research and education in Hong Kong and beyond. Nowadays, research data management (RDM) is an important strategy for organizing, storing, preserving, and sharing research data. In this project, we will use our world-leading research on vacuole biogenesis and function at the AoE Centre as an example to develop good practices and systems for RDM and its promotion under the CUHK system, with the ultimate goal of sharing our experience (via a CUHK workshop) and promoting good RDM practice for both research postgraduate students (RPGs) and researchers. These goals can be achieved by 1) conducting research on (Cryo-)EM and live-cell imaging database management practices within the context of organelle biogenesis; 2) developing robust and tailored RDM strategies and systems specifically designed for the CUHK system; 3) sharing experiences, insights, and developed resources with both RPGs and researchers; and 4) promoting our good RDM practices in the broader plant cell biology community beyond CUHK. This project will be the first of its kind in the field of organelle biogenesis and function to systematically address RDM. |
Start Date |
1-Jan-2024 |
End Date |
31-Dec-2025 |
Faculty of Science |
Title |
Accelerated Computation of Surface Mapping |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
CHOI Pui Tung Gary |
Department of Mathematics |
Abstract |
Surface mapping plays an important role in many science and engineering applications, including shape registration, shape modeling, and shape analysis. For instance, by mapping two surfaces onto a common parameter domain, one can easily establish a one-to-one correspondence between every part of them and hence compare the shape difference between the two surfaces. Also, shape modeling and remeshing can be effectively done with the aid of a suitable parameterization mapping of the surfaces. However, many prior surface mapping methods are computationally expensive and hence difficult to use in large-scale problems. In this project, we aim to develop new surface mapping approaches with accelerated computation, so that surface mapping can be more efficiently achieved for large datasets of various surfaces. In particular, we will develop data-driven and optimization-based methods for computing surface mappings with different desired mapping effects, such that shape deformations can be produced in a highly efficient and accurate manner. The mapping methods can then be efficiently applied to a wide range of practical problems. The mapping datasets generated in this work can further be used as benchmarks for future algorithmic developments. |
Start Date |
1-Jan-2024 |
End Date |
31-Dec-2025 |
Faculty of Science |
Title |
Research Data Management for Synthetic Protocols of Tailored Colloidal Particles |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
KWOK Man Hin |
Department of Chemistry |
Abstract |
The design, preparation and characterization of colloidal particles and their derivatives are extremely important in any colloid science and nanotechnology research. Only consistent, high-quality colloidal particle samples will produce scientifically valid comparisons and results. However, the detailed procedures of various syntheses are scattered across numerous journal papers and other publications, especially for tailored designs. A comprehensive colloid synthesis database with a good data management structure can readily solve the problem. With such a database, anyone can access the recipes and procedures for producing the desired colloids and gain insight into designing new synthetic protocols based on the existing ones. The CUHK DMP tools and the CUHK Research Data Repository will be ideal platforms for setting up such a database for other researchers. |
Start Date |
1-Jan-2024 |
End Date |
31-Dec-2025 |
Faculty of Social Science |
Title |
Establishing an Open-Access Database for Sign Language Learning Research: Integrating Behavioral, EEG, and fMRI Data |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
LIU Xiaonan |
Department of Psychology |
Abstract |
This project aims to develop an open-access database that consolidates data from behavioral studies, EEG, and fMRI experiments focused on sign language learning. The primary objective is to examine the effectiveness of different learning methods in the acquisition of sign language. By integrating data from multiple research methodologies, the project seeks to provide a comprehensive resource for researchers studying language learning processes. The database will include a variety of data types, from neural activation patterns captured through EEG and fMRI to behavioral responses observed in learning experiments. This multifaceted approach allows for a more detailed examination of the cognitive processes involved in sign language learning.
The project aims to contribute to the understanding of how different learning strategies impact the acquisition of sign language, which could have implications for educational practices, particularly in the context of teaching the hearing-impaired. By making the database accessible to researchers and educators, the project supports the principles of open science and collaborative research. The findings from this study have the potential to inform educational strategies and contribute to the broader field of cognitive neuroscience and language acquisition. |
Start Date |
1-Jan-2024 |
End Date |
31-Dec-2025 |
Faculty of Social Science |
Title |
Building Digital Self-efficacy for Continued Employment: The Predictors of Older Employees’ Digital Self-efficacy and Contextual Boundary Conditions of Whether It Contributes to Motivation to Continue Working Beyond Retirement Age |
Principal Investigator/
Co-investigator(s) |
Name |
Affiliation |
Principal Investigator |
PFROMBECK Julian |
Department of Psychology |
Abstract |
Due to demographic changes and a lack of skilled workers in many countries, organizations aim to motivate older employees to work beyond retirement-eligible age. However, to develop that motivation in an increasingly digitized world of work, older employees need the capability and confidence to master the changes in their work environment related to technological developments and applications, which is defined as their digital self-efficacy. Drawing on social cognitive theory, we aim to investigate predictors of older employees’ digital self-efficacy and how it relates to their motivation to continue working beyond retirement age. Further, we aim to investigate whether contextual boundary conditions, such as the organization’s degree of digitization and negative age stereotypes in the work environment, shape the observed relationships.
To test our hypothesized relationships, we plan to conduct a cross-lagged panel study that observes 600 older employees (i.e., age 50+) living in Hong Kong over a period of one year, as well as two experimental studies. The findings will highlight the relevance of older employees’ digital self-efficacy for their work continuance motivation and reveal factors that help to build and maintain digital self-efficacy in older age. Moreover, the results will provide new insights into the role of the organizational context (i.e., digitization of an organization and ageism in the workplace) and how it shapes whether digital self-efficacy translates into motivation to continue working. |
Start Date |
1-Apr-2024 |
End Date |
31-Mar-2026 |