Advancing Research Administration with AI: A Case Study from Emory University


Volume LVI, Number 2


 

Lisa A. Wilson
Assistant Vice President, Strategic Optimization and Training
Office of Research Administration, Emory University

Benn Konsynski, PhD
Professor
Goizueta Business School, Emory University

Tubal Yisrael
Project Support Specialist, Strategic Optimization and Training
Office of Research Administration, Emory University

 

Abstract

This case study examines the development of a proof-of-concept (PoC) generative artificial intelligence (genAI) model inspired by OpenAI's ChatGPT®, implemented within the Office of Research Administration (ORA) at Emory University. GenAI refers to AI models capable of producing human-like text. Specifically, this study shares practical insights and experiences from developing a private AI model tailored to support research administration operations. The initiative involved forming a specialized team, identifying a secure and efficient platform, and creating ORAgpt—a chatbot designed to provide Emory's research administrators with instant and accurate guidance on institutional policies, procedures, and administrative tasks. Key elements discussed include strategies for team selection, data curation, and model architecture, such as leveraging internal subject matter expertise, curating institutional documentation, and deploying cloud-based technology through Microsoft Azure. Despite challenges such as constrained funding and a compressed timeline, the project reached significant milestones, including positive stakeholder feedback and evidence indicating the model's potential to streamline tasks and enhance productivity. The findings underscore genAI's promise to transform research administration by increasing efficiency and providing a scalable framework for adoption by similar institutions. Future phases will incorporate structured governance, rigorous document vetting processes, and comprehensive financial planning to ensure sustainability.

Keywords: Office of Research Administration (ORA), ORA Knowledge Repository, Goizueta Business School, Generative AI (genAI), Large Language Model (LLM), Tokens, Chatbot, Prompt Engineering, PrivateGPT, FlowiseAI, ORAgpt, Microsoft Azure, OpenAI, OpenAI Studio, GPT-3.5 Turbo, Cloud Storage, Azure AI Search, Azure Synapse Analytics, Azure App Services, Hallucinations, Testing, Evaluation, Validation, and Verification (TEVV), Subject Matter Experts (SMEs), Planview Project Management (PPM Pro), Leena AI, Standard Operating Procedures (SOPs), Intellectual Property (IP), National Science Foundation (NSF) Proposal & Award Policies & Procedures Guide (PAPPG), National Institutes of Health (NIH), Code of Federal Regulations (CFR), Personally Identifiable Information (PII).

 

Introduction

The rapid development of artificial intelligence (AI) has opened new frontiers for enhancing efficiency across a wide range of industries, including research administration. This includes advancements in natural language processing AI models, such as large language models (LLMs). LLMs are AI systems capable of processing and generating human-like text by analyzing vast datasets of human content, such as the internet and literature, to better understand speech syntax (Naveed et al., 2024, p. 2). These generative models, referred to as genAI, can comprehend and produce natural language, enabling transformative applications across various sectors (Naveed et al., 2024, p. 3).

Despite their potential, LLMs present challenges, such as generating inaccurate or misleading information, commonly known as hallucinations. These occur when models produce plausible but incorrect content due to limitations in their understanding and reliance on patterns from their training data (Naveed et al., 2024, p. 15). To mitigate these risks, institutions have emphasized developing private models that limit external data sources, thereby enhancing the accuracy and security of AI outputs. Additionally, effective use of LLMs requires a practice known as prompt engineering, which involves refining queries to elicit accurate and relevant responses from the model (Naveed et al., 2024, p. 6).

In November 2022, OpenAI launched ChatGPT®, an advanced generative AI model. GPT stands for "Generative Pre-trained Transformer," a neural network architecture designed to generate coherent and contextually relevant text based on input prompts (Naveed et al., 2024, p. 10). Given the novelty and complexity of these technologies, conveying the potential of genAI to non-technical colleagues posed a significant challenge. To address this, a project team at Emory embarked on a proof-of-concept (PoC) project to explore the application of genAI in research administration.

This article presents a case study detailing the ORAgpt chatbot development, capturing real-world experiences and practical insights gained throughout the process. The ORAgpt chatbot, named after the Office of Research Administration (ORA), was developed as a private LLM-based assistant to streamline administrative tasks and support research operations. This initiative, led by the Office of Strategic Optimization and Training (OSOT) within the ORA, in collaboration with Goizueta Business School, aimed to provide research administrators with instant, accurate information on ORA processes and policies. The project demonstrated AI's potential to transform research administration by reducing response times and ensuring information consistency. Secondary objectives included raising awareness of genAI technology among research administrators and conducting live demonstrations of ORAgpt (Konsynski, 2023). This case study highlights the development process, including team selection, platform choices, overcoming significant challenges, preparing for demonstrations, and navigating budgetary constraints.

 

Background

Recognizing AI's potential, the interim assistant vice president of OSOT began collaborating with Leena AI in December 2022 to develop a ChatGPT-like sandbox customized for research administration. This external (buy) approach leveraged the vendor's expertise to develop a sandbox model quickly. The external model served to demonstrate the potential impact, value, and strategic benefits of integrating generative AI (genAI) into research administration, and the collaboration with Leena AI was designed to inform a build-or-buy decision. The sandbox was later presented to Emory's vice president for research in July 2023, marking a significant step toward revolutionizing research administration through AI.

While the external model demonstrated the potential of generative AI, the absence of allocated funding in the FY23 budget led to the decision not to move beyond the Leena AI sandbox and instead to pursue an internally built chatbot.

In August 2023, the interim assistant vice president assembled an internal team to build a second proof of concept (PoC) for a private LLM/GPT solution. While the development platform had not yet been selected, the team named the solution and the bot "ORAgpt," after the Office of Research Administration (ORA), reflecting its intended purpose and organizational alignment. This dual approach was strategically chosen to address several key objectives:

  1. Demonstrate Value: Develop a compelling AI demo to secure buy-in from senior leadership for further investment.
  2. Identify Use Cases: Address real pain points in research administration by refining specific AI functionalities.
  3. Establish Feasibility: Assess resources, cost, and team structure needed to scale from PoC to full production.

The internal PoC was successfully unveiled in October 2023, showcasing the project's rapid progression and strategic planning from the initial concept to the demonstration of both solutions.

 

Use Cases Driving the Project

Through discussions with new employees, it was discovered that they often experienced frustration due to the length of time it took for colleagues and peers to answer questions about specific job-related tasks. Some reported waiting hours or even days for responses, preventing them from completing their work on time. Additionally, many job aids and Standard Operating Procedures (SOPs) were outdated or difficult to follow.

One of the primary goals of this PoC was to demonstrate how an AI-powered solution could address these pain points by improving response times and ensuring consistent, accurate, and complete information delivery.

The vision and intent of the ORAgpt PoC project was to explore the new technology's feasibility in ORA. Traditional chatbots are well-known tools for managing information requests from users, but the envisioned genAI assistant would push current capabilities further. Virtual assistants with the capacity to create and update files would transform research administration and the way we perform work. A detailed description of each is provided below.

Virtual Assistant for Instantaneous Answers: A virtual assistant designed to offer 24/7 access to process and procedure information, enabling research administrators to receive immediate guidance. Such a tool can reduce delays at full scale by providing reliable, consistent answers to task-related questions. It can potentially enhance new hires' onboarding and training experience and decrease employees' time to achieve portfolio management proficiency. For instance, research administrators can quickly obtain information about award setup and closeout, system navigation, invoice processing, and other key policies and procedures. Benefits may include lower operational costs, better customer service, and reduced staff turnover (Wilson, 2023a).

Document Generation for SOP Updates: SOP development has long been time-consuming and inconsistent for ORA staff. With over 90 SOPs needing revision, outsourcing these updates proved cost-prohibitive, with vendor quotes starting at $50,000 per SOP. Leveraging genAI, the project aimed to streamline SOP development, eliminate outsourcing costs, and ensure consistency in style, language, and format. This use case aimed to assist research administrators in revising or creating new SOPs by guiding them through pre-set prompts, references, and templates, generating updated or new SOP documents. The goal was to standardize formatting, reduce development time, and update SOPs as needed using AI-powered document generation.

 

Student Collaboration

In collaboration with the George S. Craft Distinguished University Professor at Emory's Goizueta Business School, the project team envisioned leveraging the latest AI technology while providing students with valuable hands-on learning experiences. A select group of students, affectionately known as the 'Students of Benn' (SOBs), were recruited and directly involved in developing the private ORA genAI model.

The newly established capstone project for the students integrated academic theory with practical application, creating a dynamic learning environment and infusing the classroom with real-world business problem-solving experiences.

 

Collaborative Development

The project emphasized configuring the model to utilize select, curated knowledge resources from ORA’s Knowledge Repository, compiling and vetting SOPs, policies, training materials, and more into a shared directory. Unrelated, confidential, sensitive, personally identifiable information, proposals, awards, and Intellectual Property (IP) data were excluded from the LLM’s knowledge base, upholding data integrity and security standards. The charge was to manage the LLM’s output and response to user queries using only ORA-related data.

The student team initially tested platforms like PrivateGPT and FlowiseAI before the project team agreed on Microsoft Azure.

 

Constraints

The project encountered several significant constraints, particularly during the exploration phase. Key constraints included a limited budget, a self-imposed compressed timeline, and the critical need to mitigate data security and system integration risks. These obstacles necessitated meticulous planning and the implementation of robust risk management strategies to safeguard project outcomes. The tight schedule underscored the urgency to quickly demonstrate the potential of generative AI (genAI) capabilities, yet the project team prioritized carefully selecting a secure and scalable platform. Organizational approval of the chosen solution was paramount, especially given concerns about data confidentiality and system performance. The team faced the non-negotiable imperative of preventing potential risks to sensitive institutional data while ensuring the solution's scalability.

Each potential solution was rigorously evaluated against these stringent criteria. PrivateGPT, while promising in managing Personally Identifiable Information (PII), proved inadequate due to its slow response times and susceptibility to errors, making it unsuitable for live demonstrations. Similarly, FlowiseAI, another contender, demonstrated potential but ultimately underperformed in critical performance benchmarks. These limitations underscored the complexity of balancing security, scalability, and functionality in selecting a viable AI platform.

 

Theoretical Framework

This study is grounded in multiple organizational theories that provide a comprehensive framework for understanding the integration of AI into research administration. The primary theories applied include Weber's The Theory of Social and Economic Organization (1947), Knowledge Management Theory (Nonaka & Takeuchi, 1995), and Simon’s concept of Bounded Rationality (1957).

  • Weber's Theory of Bureaucracy emphasizes efficiency, consistency, and predictability in large organizations. AI's ability to automate routine tasks and provide standardized responses aligns with these principles, allowing research administrators to focus on higher-level responsibilities and improving overall organizational performance.
  • Knowledge Management Theory supports the integration of AI by highlighting its role in capturing, storing, and disseminating institutional knowledge. The ORAgpt model addresses the challenge of knowledge transfer by making critical, time-sensitive information readily accessible to research administrators, thus reducing delays and inconsistencies.
  • Simon's Bounded Rationality concept underscores the limitations of human decision-making due to available information and cognitive constraints. AI enhances decision-making by providing timely and relevant information, compensating for these limitations. However, the risk of generating incorrect or misleading information necessitates strong governance structures and ethical guidelines.

Furthermore, this work's theoretical underpinnings emphasize AI's potential to significantly improve administrative efficiency by delivering accurate and timely information. Drawing on theories of knowledge management and information systems, the study explores how AI can streamline operations, support decision-making, and optimize resource utilization in research administration. This is supported by Popenici and Kerr (2017), who explored the impact of artificial intelligence on teaching and learning in higher education, including its implications for administrative efficiency and effectiveness.

This theoretical framework guided the study from the formulation of core questions to the methods used and the qualitative exploration conducted. It also informed the conclusions drawn, emphasizing AI's potential to significantly improve administrative efficiency by delivering accurate and timely information.

At the time, much of the existing literature related to the utilization of AI focused on industries such as healthcare, ecommerce, finance, and marketing (Deloitte Center for Higher Education Excellence, 2023; Popenici & Kerr, 2017). While Deloitte (2023) discusses AI applications in healthcare, finance, and ecommerce, Popenici and Kerr (2017) specifically explore AI within higher education. Few discussions addressed the use of AI, or in this case generative AI, in the university research administration environment (particularly the integration of private LLMs).

This case study acknowledges, addresses, and contributes to existing gaps in the literature by providing practical insights into the exploration, deployment, and use of private LLMs within university research administration operations. The successful implementation of this proof-of-concept underscores generative AI’s significant potential to enhance research administration through improved knowledge retrieval, efficient creation and updating of SOPs and related process documentation, and other effective natural-language content generation.

 

Methodology

The core approach to deploying Emory ORA's private LLM leveraged preview access to Microsoft Azure's generative AI offering, made available in 2023 through Microsoft's partnership with OpenAI. The OpenAI Studio GPT-3.5 Turbo engine was made available to licensed users and the public via an early production preview. Through Microsoft's Access Request Form, approved users gained access to Azure AI Developer Studio, where developers could configure and deploy models within a controlled environment.

 

Requirements

Obtaining a license to use the Azure platform was the first step in developing the model. Since Emory already had a Microsoft enterprise agreement for Azure services, the team, including the students, only needed to deploy specific Azure resources and gain access to the new model. This required submitting an Access Request Form to Microsoft for access to OpenAI Studio, where the use case and impact were reviewed before approval.

Once approved, the development team deployed the basic Azure architectural tools needed for customizing the model. The team then compiled and indexed the ORA knowledge documentation to serve as the foundation for grounding the bot. In OpenAI Studio, the developers selected and configured the model which became ORAgpt, Emory ORA’s private chatbot tailored for research administration. After grounding ORAgpt with Emory ORA data, it was deployed to an Azure-hosted web instance using Azure App Services.
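
As a rough illustration of this grounding and deployment step, the sketch below shows how a chat request can be issued in Python against an Azure OpenAI deployment that references an Azure AI Search index (the "on your data" pattern). It is an assumption-laden sketch rather than the ORAgpt implementation: the endpoint, deployment, index, and key values are placeholders, and the exact request schema depends on the API version in use.

    # Illustrative sketch only (not the ORAgpt code): a chat completion request to an
    # Azure OpenAI deployment grounded on an Azure AI Search index. All names, keys,
    # and the API version are placeholders and depend on the resources deployed.
    import requests

    AOAI_ENDPOINT = "https://<your-openai-resource>.openai.azure.com"
    DEPLOYMENT = "gpt-35-turbo"                       # model deployment name
    SEARCH_ENDPOINT = "https://<your-search-service>.search.windows.net"
    SEARCH_INDEX = "ora-knowledge-index"

    url = (f"{AOAI_ENDPOINT}/openai/deployments/{DEPLOYMENT}"
           "/chat/completions?api-version=2024-02-01")

    body = {
        "messages": [
            {"role": "system",
             "content": "Answer only from the indexed ORA documents."},
            {"role": "user",
             "content": "What are the steps for award closeout?"},
        ],
        # Grounding ("on your data") block; the schema differs across API versions.
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": SEARCH_ENDPOINT,
                "index_name": SEARCH_INDEX,
                "authentication": {"type": "api_key", "key": "<search-key>"},
            },
        }],
        "temperature": 0.2,  # keep creativity low so answers stay close to sources
    }

    response = requests.post(url, json=body, headers={"api-key": "<azure-openai-key>"})
    print(response.json()["choices"][0]["message"]["content"])

In the PoC itself, the equivalent configuration was performed interactively in OpenAI Studio rather than in code.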

Following deployment, the team conducted testing and evaluation, using various prompts to assess overall performance, accuracy, and responsiveness.

 

Payment Structure

The project team leveraged Microsoft Azure’s pay-as-you-go model, which provided flexible cost management while offering scalable resources as needed.

 

Pay-as-you-Go Approach

This pricing model enabled billing only for the resources used, with no upfront costs or termination fees. It was ideal for an unfunded PoC project, as resources could be scaled up or down dynamically based on demand.

 

Billing Details

The private ORAgpt chatbot's architectural components were billed monthly based on the type and quantity of resources used. These included storage, databases, search and retrieval enhancement services, and interface instance components. Since each resource had its own pricing model, the team used the Microsoft Azure Pricing Calculator to estimate initial and projected costs.

Billing was directly tied to resource scalability, with costs increasing as storage expanded and decreasing when resources were scaled down. This flexibility was ideal for the bot's future development.

Azure invoices provided a detailed breakdown of all resources used and their associated costs. Several built-in tools, including cost analysis and budgeting, helped the team maintain transparency and effectively manage development expenses. 

 

Qualitative Approach to Model Evaluation

The PoC aimed to demonstrate insights from OpenAI's newly announced generative AI model, GPT-3.5 Turbo. The project team's decision to conduct a qualitative analysis of the selected OpenAI GPT-3.5 Turbo model was driven by its enhanced capacity to generate, summarize, and present search results using human syntax (natural language).

The testing phase evaluated the model's output against ORA grant lifecycle source documentation, with stakeholder feedback assessing accuracy, completeness, and relevance. A key focus was "grounding": ensuring the model's responses were derived exclusively from the provided knowledge documentation.

Another key objective was to determine whether the model relied on pre-trained knowledge or remained within research administration topics. This evaluation took place in the Azure AI Studio’s developer playground before deployment, where model parameters could also be adjusted based on performance. These parameters included the level of creativity in referencing documents and system prompts to influence behavior with end users. 

The project team began the verification process to assess the accuracy and validity of the model's output, focusing on whether the responses were appropriately aligned with the queries. To support this process, a shared Prompt Evaluation Log was created using Google Docs to document tester prompts and the model's responses, allowing for systematic tracking of performance and quality. This log featured a "notes" section for commenting on interactions and identifying iterative prompts that may have influenced the output. In this detailed log, the project team meticulously recorded each interaction with the AI model, capturing essential information such as the question ID number, the specific question asked, and the model's response. Additional details recorded included the department category, the source document containing the correct answer, the date the question was posed, and an analysis indicating whether the response was accurate. This structured and thorough approach allowed for a robust assessment and served as a foundation for necessary adjustments.

To strengthen the evaluation process, the development team held weekly meetings to share insights, discuss findings, and collaboratively reflect on the model's behavior. These meetings fostered a dynamic exchange of ideas, allowing for real-time improvements. This collaborative space also allowed team members to track the model's development and modifications, providing a transparent record of ongoing refinements. By capturing a broad range of data points and fostering open dialogue, these recurring discussions maintained a rigorous and adaptive approach to optimizing the model to meet the practical needs of research administrators.
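
To illustrate the structure of such a log, the sketch below captures one record with the fields described above (question ID, question, response, department category, source document, date, and an accuracy judgment). The field names and the CSV format are illustrative assumptions; the team's actual log was kept in Google Docs.

    # Illustrative prompt-evaluation log record; the field names mirror those
    # described above but are assumptions, not the team's actual template.
    import csv
    from datetime import date

    FIELDS = ["question_id", "question", "model_response", "department_category",
              "source_document", "date_asked", "accurate", "notes"]

    record = {
        "question_id": 1,
        "question": "What are the steps for award closeout?",
        "model_response": "<paste the chatbot's output here>",
        "department_category": "Post-award",
        "source_document": "Award Closeout SOP",
        "date_asked": date.today().isoformat(),
        "accurate": "Yes",
        "notes": "Matched the SOP; a follow-up prompt was needed for timelines.",
    }

    with open("prompt_evaluation_log.csv", "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if f.tell() == 0:           # add a header row only when the file is new
            writer.writeheader()
        writer.writerow(record)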

In addition to accuracy and validity, responses were evaluated for completeness. The goal was to determine whether prompts were sufficiently answered and included necessary levels of detail. In other words, the team sought to determine if any output information was diluted through summarization or if the model could be prompted to provide varying levels of details from the source documents. This process also helped to identify gaps in documentation related to queried topics. If the responses were found to be insufficient, new files were indexed and the model was prompted again to assess any improvements in detail. 

The project team also investigated format control—the model’s ability to generate tailored text output with specific elements such as headings, bullet points, tables, and font styles based on end-user prompts. They evaluated prompts that directed output text structure and format, such as bold headings, style, bullet points, key takeaways sections, and Q&A designs.

Through the investigation of these factors, the project team was able to identify the limitations or areas of opportunity for the model under basic architecture. The development process was inherently iterative, with the AI model undergoing continuous adjustments. The developer team continued to revise the AI model’s knowledge base to enhance response accuracy. This iterative loop emphasized the project's experiential nature, focusing on real-world application rather than formal experimentation, ensuring that improvements were responsive and practical.

Prompt engineering played a crucial role in shaping the AI model's performance, with carefully designed prompts aimed at eliciting precise and contextually relevant responses. For example, by understanding and utilizing the 'system prompt' feature in the OpenAI Studio developer space, many identified issues or opportunities could be mitigated and/or embedded in the model's immediate output behavior before any user input. In practice, the 'system prompt' preceded and accompanied any subsequent prompts from an end user, altering the output with predefined instructions from the developer side.

The end user would not see these tailored prior inputs and would only be aware of their initial prompt. A better understanding of the intended use case for the model would therefore lead to unique system prompts which can be used to influence or limit the responses of the model, essentially guiding how that response is presented to an end user. Areas of guidance included but are not limited to tone, focus, follow-up behavior, and specific instruction to not use pre-trained base knowledge (general knowledge) in its output (grounding).
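
A hypothetical system prompt covering these guidance areas might look like the following; the wording is an illustrative assumption rather than the prompt actually configured for ORAgpt.

    # Hypothetical system prompt illustrating tone, focus, follow-up behavior,
    # and grounding instructions; not the actual ORAgpt system prompt.
    SYSTEM_PROMPT = (
        "You are ORAgpt, a support assistant for Emory research administrators. "
        "Answer only from the ORA documents provided to you and do not draw on "
        "general pre-trained knowledge. Use a professional, concise tone, cite the "
        "source document for each answer, offer a relevant follow-up question, and "
        "if the documents do not cover a topic, say so rather than guessing."
    )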

Overall, extensive testing would be required to cover the broad volume and range of content that is provided as the knowledge base of a generative model. Therefore, the test logs should reflect and sample the full scope of subject matter the end user would seek.

 

Knowledge Data Volume and Model Response Speeds (Not Originally Tested)

Quantitative measures are equally important factors to consider when testing generative models before and during deployment. In this context, key assessment areas include system throughput (real-time or average user activity), response speed (query handling and search/retrieval lag relative to data volume), and volume testing (stress factors).

These factors were outside the scope of the PoC as the primary goal was to introduce benefit, impact, and value of generative AI capabilities to the Emory University ORA community and stakeholders. Therefore, the testing phase focused on assessing information retrieval capacity and response quality.
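
Although such measurements were outside the scope of the PoC, a basic response-speed check can be scripted along the lines below; the timing loop and the placeholder query function are illustrative assumptions, not part of the project's testing.

    # Illustrative response-time check (not part of the PoC testing).
    import time
    import statistics

    def ask_bot(prompt: str) -> str:
        # Placeholder for a call to the deployed chatbot endpoint.
        time.sleep(0.5)
        return "stub response"

    latencies = []
    for _ in range(10):
        start = time.perf_counter()
        ask_bot("What is the award setup process?")
        latencies.append(time.perf_counter() - start)

    print(f"mean {statistics.mean(latencies):.2f}s, max {max(latencies):.2f}s")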

Through the Microsoft Azure platform, the project team had access to projected billing cost, architectural resources, and various data dashboards that provided timestamped insights on service-level components, model performance, and end user activity. Identifying the availability and depth of these insights informed the need to create and track them separately. 

Additionally, the cost of implementing private Generative AI models is a significant factor to consider in assessing the institution’s feasibility for larger-scale deployment. Cloud environments offer significant advantages as server storage and computational resources are quickly allocated to the end users (organizations), while infrastructure management is handled on the back end by the serving entity, Microsoft Azure. 

Equally important, these types of low-code, cloud-based platforms, architecture components, and model deployment approaches generate costs via Cloud Service Level Agreement (SLA) expenditures and token generation/processing fees.

Although the project adopted a practical testing approach, the documentation phase underscored the need for a more structured Testing, Evaluation, Validation, and Verification (TEVV) process in future iterations to ensure even greater rigor and reliability.

 

Assessing the Generative AI Use Case in ORA

Following the testing phase, the model's capabilities were thoroughly documented and supported by test logs. Using these logs, the next step was gathering feedback from ORA research administrator Subject Matter Experts (SMEs) before transitioning to real-time interactions with pilot groups. 

The initial approach involved sharing the test log data with SMEs familiar with the subject matter files selected for the model, aiming to evaluate its relevance for departmental workflows and new hire onboarding inquiries. Research administrators were encouraged to provide feedback on an ad hoc basis, facilitating a flexible and continuous review process that accommodated work schedules and emergent observations.

Four volunteer SMEs from both pre- and post-award administration participated in the review process. The volunteers were selected based on peer recommendations or supervisors' approvals and had mid-career to seasoned experience, defined as over five years in research administration. The decision to use mid-career to seasoned research SMEs rather than a broader group of evaluators was based on the need for evaluators who understood research administration processes. This approach ensured the completeness, level of detail, relevance, and accuracy of the model's output.

SMEs had the opportunity to interact directly with the ORAgpt chatbot, posing research administration-related questions that mirrored real-world scenarios encountered by both new and seasoned research administrators. The questions were crafted to reflect typical inquiries related to daily tasks, responsibilities, and relevant policies, aligning with the SMEs' roles.

By evaluating the perceived usefulness of ORAgpt in key ORA areas, the team assessed how a fully deployed generative chatbot within the ORA community might perform. This is an important step for use case development as without prior testing and documentation of the bot’s generative features and capabilities, it would be difficult to envision its potential benefits or develop realistic expectations regarding impact. 

Engaging SMEs helps validate the bot’s value and demonstrate the benefits of integrating generative technology into operational workflows.

Generative models are trained on vast amounts of human language syntax, making them incredibly proficient at producing and predicting strings of text. However, as with all machine learning models, rigorous testing is essential to identify errors and ensure the consistency and quality of the data being provided to end users. Early and thorough issue detection further strengthens the business case for investing in Generative AI in Research Administration and enhances stakeholder confidence in its potential.

 

Selection and Curation of ORA Documentation

Deploying Generative AI models that respond using organizational data requires careful curation and identification of datasets, systems, and documentation to build reliable knowledge bases or corpuses. In Emory’s case, data collection was facilitated through multiple channels to ensure comprehensive and systematic evaluation. 

Documents were selected from an established ORA Knowledge Repository and placed into a "working files" project folder for curation. Once curated, source data were uploaded and indexed in Microsoft Azure for model retrieval. The development team validated the accuracy and reliability of files before integrating them into the LLM. This process was essential for ensuring that the information provided by the model met institutional standards of precision and relevance.
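
As a simplified sketch of the upload step, curated files can be pushed to an Azure Storage container with the azure-storage-blob SDK, as shown below; the container, folder, and connection string are placeholders, and the subsequent indexing was performed by Azure Cognitive Search rather than in this snippet.

    # Simplified sketch (placeholder names): uploading curated ORA documents to an
    # Azure Storage container so Azure Cognitive Search can index them.
    from pathlib import Path
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string("<storage-connection-string>")
    container = service.get_container_client("ora-working-files")

    for doc in Path("working_files").glob("*.pdf"):
        with doc.open("rb") as data:
            container.upload_blob(name=doc.name, data=data, overwrite=True)
        print(f"uploaded {doc.name}")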

 

Project Foundation and Charter

Before recruiting and formally charging the internal development team, a comprehensive project charter (business case) was developed to provide a strategic and operational framework for the ORAgpt initiative. This charter outlined the project's purpose: to create a proof-of-concept AI chatbot using an LLM trained on curated documents from the ORA Knowledge Repository. The primary objectives were to validate the model's feasibility, streamline research administration operations, and improve staff productivity by offering an AI-powered virtual assistant.

The charter specified key deliverables, including the deployment of a chatbot capable of generating SOPs, job aids, and training materials, as well as an AI demonstration at Emory's ORA Research Week conference in the fall of 2023. The business need was driven by challenges such as the steep 180-day learning curve for new hires, inefficiencies in updating SOPs, and the lack of instant access to answers to task-specific questions. Additionally, the charter defined key use cases and delineated both in-scope and out-of-scope content to maintain data security and relevance.

This well-structured foundation provided a clear direction for the project and set the stage for recruiting a diverse, cross-functional team of subject matter experts, students, and collaborators.

 

Project Team

Developing an LLM for research administration was a complex initiative that required extensive collaboration across the Offices of Research Administration, Information Systems, and Cybersecurity. The project was conceived, initiated, and sponsored by the Interim Assistant Vice President of the Office of Strategic Optimization and Training (OSOT) within ORA, who provided high-level oversight to ensure alignment with the institution's organizational goals and strategic vision.

The Director of Cybersecurity played a crucial role in Azure resource deployment, configuration, subscription management, and security-related subject matter, ensuring the model's infrastructure was secure and compliant with institutional policies. 

Faculty from the Goizueta Business School, with expertise in information systems and operations management, offered critical insights into AI applications while upholding rigorous ethical and technical standards. Graduate and undergraduate students from the Goizueta Business School worked alongside faculty in the technical development and refinement of the chatbot. The Students of Benn (SOBs) applied academic training to configure the Azure-based model, optimize performance, and troubleshoot technical issues, gaining invaluable hands-on experience in implementing generative AI technologies. This faculty-student collaboration was essential for successfully integrating theoretical knowledge with practical application and fostering innovation in AI-driven research administration solutions.

The project support specialist from OSOT played a pivotal role in coordinating implementation, including curating essential documentation from the knowledge repository, managing the project timeline, facilitating collaboration among team members, and leading the content delivery for the proof-of-concept demonstration. 

Seasoned research administrators from ORA contributed in-depth content expertise, ensuring the chatbot's knowledge base accurately reflected institutional policies, procedures, and compliance guidelines.

A strategic partnership with Microsoft further enhanced the project. A specialized team of data and AI experts dedicated to advancing technology in education provided essential technical guidance and support. Their expertise was crucial for seamlessly integrating Azure-based AI tools and optimizing the LLM's overall functionality and efficiency.

Academic scholars, administrative professionals, cybersecurity specialists, and external technology experts collaborated to create a comprehensive proof-of-concept model. This interdisciplinary effort underscored the project's complexity and set the stage for future advancements in the application of artificial intelligence within research administration. The initiative highlighted AI’s potential to drive transformative improvements in efficiency, accuracy, security, and strategic decision-making.

 

Project Management Implementation Plan

The development of the ORAgpt chatbot followed a structured yet adaptive project management approach, balancing known requirements with emerging insights as the team navigated a new AI-driven landscape. This methodology was chosen for its proven structure, organization, and reproducibility, capturing the resulting tasks and steps needed to deploy a generative chatbot.

 

Plan Development

The team created a Work Breakdown Structure (WBS) in Excel, listing known steps for project initiation in sequential order. The charter was created first and later informed the business plan, defining the project's purpose, objectives, scope, and risk mitigation strategies. The next step involved identifying the project lead and assembling a cross-functional team. As part of this phase, the team explored potential LLM development platforms and evaluated their technical feasibility to determine the best fit for the project.

 

Unknown Action Items: Iterative Development & Agile Adjustments

Since AI chatbots in research administration were still an emerging technology, the team did not have predefined implementation steps for specific generative chatbot deployment. Instead, the project plan WBS evolved simultaneously with development. The technical team conducted extensive research into AI development platforms, consulting experts to gain insight into best practices. As findings emerged, action items were updated dynamically, allowing the team to remain agile.

 

Managing Critical Path & Execution Strategy

Given the project's ambitious timeline, the team set a self-imposed six-week deadline to demo the chatbot at an upcoming conference. With every component being critical to the project’s success, delays in any area, whether technical, content-related, or testing, could have impacted the demo. To operate efficiently, the team formed parallel workstreams. One group focused on LLM learning and chatbot development, while another worked on identifying and testing prompt questions for accuracy. This second group also prepared for both a live demonstration and a pre-recorded version as a backup to mitigate any technical risks.

 

Tracking Progress & Adapting the Plan

To maintain momentum, the team conducted daily check-ins, tracking real-time progress and addressing roadblocks as they arose. The project management approach allowed the team to track and manage changes to tasks and/or scope. While there were opportunities for changes, the team maintained strict adherence to the predefined use cases, SOP updates, and 24/7 task-related questions, preventing scope creep. Gantt charts supported effective communication across parallel teams, monitoring of task dependencies, timely execution, and reporting of project progress to stakeholders.

 

Project Communication

Due to the self-imposed deadline, the team did not implement a formal communication plan but instead relied on real-time collaboration. Regular Zoom and Teams calls facilitated virtual meetings, while in-person discussions allowed for more direct problem-solving. Email and text messaging were used for quick updates. Daily meetings ensured continuous alignment among research administration staff, business school developers, and IT personnel, allowing the team to stay synchronized despite the rapid development cycle.

 

Scope & Timeline Management

Despite the challenges encountered, the project remained on track, meeting the six-week deadline as originally planned. No major changes or delays were introduced, as the team had clearly defined objectives and adhered to their structured project timeline.

 

Architecture Design and Development Details

The architecture for the proof-of-concept (PoC) LLM was meticulously crafted using a suite of Microsoft Azure cloud services to create a secure, scalable, and efficient system tailored to the unique demands of research administration (Microsoft, n.d.).

The development process was initiated with Azure Storage, a service that provides a robust, secure, and highly scalable cloud-based storage solution. This service housed the curated internal organizational data, ensuring data integrity and enabling seamless access for subsequent indexing and querying. Before storage, the data underwent a rigorous review and preparation phase to ensure that only the most relevant and vetted information was included in the model's knowledge base, a critical step for delivering precise, AI-driven responses aligned with institutional requirements.

Once the data was securely stored, Azure Cognitive Search was deployed to index the content. This search-as-a-service tool utilizes artificial intelligence to extract insights from both structured and unstructured data, transforming it into a comprehensive, full-text searchable index. Azure Cognitive Search served as a critical intermediary between data storage and the language model, facilitating complex and efficient search queries. It supported API calls and sophisticated search and retrieval functionality, enhancing the efficiency and precision of the model's outputs and significantly improving response accuracy (Microsoft, n.d.).
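
For illustration, a direct keyword query against such an index can be issued with the azure-search-documents SDK, as sketched below; the service, index, and field names are placeholders, and in the PoC retrieval was invoked through the Azure OpenAI integration rather than called this way.

    # Illustrative direct query against an Azure Cognitive Search index
    # (placeholder names; the PoC invoked search through the OpenAI integration).
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    client = SearchClient(
        endpoint="https://<your-search-service>.search.windows.net",
        index_name="ora-knowledge-index",
        credential=AzureKeyCredential("<search-key>"),
    )

    results = client.search(search_text="award closeout checklist", top=3)
    for doc in results:
        # Field names depend on how the index was defined; these are assumptions.
        print(doc.get("metadata_storage_name"), doc.get("@search.score"))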

The core natural language processing capabilities were powered by the OpenAI GPT-3.5 Turbo model, accessed through OpenAI Studio. This model played a pivotal role in interpreting and processing user queries, converting them into actionable search requests executed by Azure Cognitive Search. 

The advanced natural language understanding of GPT-3.5 Turbo optimized the search experience, generating contextually appropriate responses that complied with institutional guidelines and met the high standards required for research administration tasks (Microsoft, n.d.).

To ensure seamless user experience, the model’s responses were delivered through Azure App Services, a fully managed platform designed for building, deploying, and scaling web applications and APIs. Azure App Services provided a streamlined, user-friendly interface that supported interactions across multiple platforms, including web pages, mobile applications, and chatbots. This multi-platform accessibility allowed research administrators to interact seamlessly with the AI model, making it a practical and effective tool that supports daily tasks and operational needs.
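
A front end of this kind can be as small as a single web route that relays a user's question to the grounded model. The Flask sketch below is an illustrative assumption about what such an interface could look like when hosted on Azure App Services; it is not the ORAgpt front end.

    # Illustrative minimal web interface (assumption, not the ORAgpt front end):
    # one route that relays a user's question to the grounded model.
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def ask_oragpt(question: str) -> str:
        # Placeholder for the grounded Azure OpenAI call sketched earlier.
        return f"(model answer to: {question})"

    @app.post("/ask")
    def ask():
        question = request.get_json(force=True).get("question", "")
        return jsonify({"answer": ask_oragpt(question)})

    if __name__ == "__main__":
        app.run()  # Azure App Services would typically run this behind a production server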

Although Azure Synapse Analytics was not utilized in the initial deployment phase, it was identified as a key component for future scalability. This integrated analytics service combines big data and data warehousing capabilities, offering data ingestion, preparation, and complex analysis functionalities. Integrating Azure Synapse Analytics in the future would enable the system to handle larger, more complex datasets and support comprehensive data analysis, thereby expanding the model’s capabilities and enhancing its overall utility for research administration (Microsoft, n.d.).

 

Figure 1. PoC Azure Architecture: Workflow from data collection to end-user interface, including Azure Synapse Analytics (for future big data integration), Azure Storage, Azure Cognitive Search, and the OpenAI GPT-3.5 engine.

 

The architecture of the PoC LLM demonstrated a robust and well-considered approach to integrating artificial intelligence into research administration. The design emphasized data security, operational efficiency, and scalability by leveraging Microsoft Azure's secure and scalable infrastructure. This thoughtful deployment laid a solid foundation for future enhancements, supporting the long-term vision of integrating AI-driven solutions into administrative processes for improved efficiency, accuracy, and strategic decision-making (Microsoft, n.d.).

To ensure data security and mitigate risks, the ORAgpt chatbot operated within Emory's secure network infrastructure, with access restricted to ORA staff. The model's training data was curated from vetted internal resources, such as SOPs, FAQs, checklists, training materials, and sponsor guidelines, while explicitly excluding any confidential or sensitive data, including personally identifiable information and proprietary content. The project adhered to ethical guidelines and compliance measures to safeguard data privacy.

 

Model Functionality: GPT Model Customization and Use of Private Data

The GPT model’s behavior was customized through the "System Message" configuration within Microsoft Azure’s OpenAI Studio, as shown in Figure 2. In this setup, the chatbot was designated as “ORAgpt” and assigned a specific role as a “support assistant designed to help research administrators find and create necessary information.” This system-level configuration enabled the project team to comprehensively define the model’s language style, behavior, and outputs. Customization features included crafting tailored opening responses, follow-up interactions, and ensuring the chatbot could cite relevant source documents.

 

Figure 2. Configuring the Model and Bot’s Behavior: Developer playground in Microsoft Azure’s OpenAI Studio.

 

Maintaining the model’s grounding exclusively within the provided dataset was crucial for ensuring response accuracy and relevance. The configuration process included strict adherence to the parameters set for the model, effectively reducing the risk of generating inaccurate or misleading information—commonly known as "hallucinations" in AI terminology. This meticulous approach allowed ORAgpt to operate within clearly defined boundaries, delivering dependable and context-appropriate answers tailored to the needs of research administration staff.

 

Risk Management and Scope of Source Data

Developing ORAgpt involved recognizing several key risks and implementing effective mitigation strategies. The project charter documented these risks and the corresponding strategies to ensure the project's success and safeguard sensitive data throughout the development process.

 

Scope of Source Data

A fundamental part of minimizing risks was carefully defining the scope of the data corpus used for the proof-of-concept (PoC). The in-scope documents came from the ORA Knowledge Repository and included SOPs, checklists, job aids, and policies. These documents addressed critical tasks, especially post-award closeout processes, where staff often faced difficulties.

One of the most significant challenges encountered was the reliance on outdated SOPs as training data. The chatbot generated responses directly from these documents, but because many policies and procedures had changed without being formally updated, the bot's answers often conflicted with end-user expectations. This discrepancy led some users to perceive the chatbot as inaccurate, even though it correctly reflected the data provided. While acknowledging that some SOPs were outdated, the primary goal of the PoC was to validate the model's functionality rather than ensure comprehensive, current content. Therefore, the emphasis was on quickly incorporating documents into the corpus, with plans to sunset the PoC and implement a more rigorous vetting process in a later phase.

These challenges of using outdated SOPs as training data highlighted a broader institutional issue—the need for timely updates to organizational knowledge bases to ensure AI tools remain relevant and effective.

Out-of-scope content included confidential, sensitive, and proprietary information such as PII, proposals, awards, and IP. To maintain strict data security, the integration of third-party applications and access to external internet-based knowledge sources were also excluded. Documents such as 2 Code of Federal Regulations (CFR) 200 – Uniform Guidance, the National Science Foundation (NSF) Proposal & Award Policies & Procedures Guide (PAPPG), and National Institutes of Health (NIH) guidelines were initially considered but excluded due to time constraints. Future iterations will include these external guidelines as part of an expanded and more refined data set.

 

Risk Management Strategies

Network Security

One of the primary risks identified was the potential for data breaches or unauthorized access to the system. ORAgpt was hosted on Microsoft Azure within Emory’s secure network infrastructure to mitigate this. Specific measures included restricted access protocols, allowing only authorized ORA staff to interact with the system. Additionally, Emory's IT department conducted rigorous security audits and reviews to ensure compliance with institutional and network security standards, employing practices like data encryption and regular security updates.

Data Privacy and Security

Data privacy risks included the possibility of inaccuracies, data breaches, and copyright violations. To address these, the data set used for the AI model consisted of over 31 rigorously vetted ORA documents. This vetting process confirmed the suitability and relevance of each document, ensuring that no confidential, proprietary, or personally identifiable information was included. Data protection measures, such as restricted data permissions and regular audits, reinforced the model’s security framework and safeguarded against data misuse.

 

Responsible AI: Addressing Compliance, Bias, Transparency, and Accountability

Ensuring compliance with government and university guidelines was crucial to the project. At the time, AI regulations in the United States, the European Union, and at Emory University were still in the early stages. The European Union's Artificial Intelligence Act was published as law on July 12, 2024, with enforcement set to begin on August 2, 2026 (European Union, 2024). The U.S. released Executive Order 14110, Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, on October 30, 2023, taking effect immediately. Emory convened its first AI governance working group in the spring of 2023 with the goal of developing a governance framework for AI.

The framework initially focused on the responsible use of generative AI, particularly in academic integrity, research, and privacy.

Because formal AI-specific guidance from the U.S. and E.U. was still evolving, and Emory had not yet issued institutional guidance, the development of the ORAgpt chatbot during the summer of 2023 took place in a regulatory vacuum. Still, the team adhered to existing and overlapping data security and data sharing policies, as well as internal ethical AI standards and best practice frameworks, including those outlined by the National Institute of Standards and Technology (2022).

Evolving government and University regulations were continuously monitored throughout the development timeline to ensure compliance and uphold ethical practices. To address transparency concerns, the team disclosed details about the purpose, scope, and selection process of documents used to build the model's corpus to SMEs and stakeholders.

All things considered, given the reliance on SME judgment and institutional documents, there was an inherent risk of bias in both human evaluation and the AI model's output. While SME judgment and the language within source materials can introduce biased performance evaluations and knowledge dissemination, the project remained focused on ORA policies and procedures. Policy documentation serves to guide ethical behavior and compliance, and procedures are intended to ensure task consistency and accuracy. To facilitate the collection of relevant feedback, the team selected source documents familiar to stakeholders and verified them for sensitivity and secure use.

Documents containing demographic language, unethical viewpoints, or other accountability-related content were not considered, as they fell outside the scope and objectives of the generative proof-of-concept. Additionally, grounding ORAgpt exclusively in institutional documents prevented potential bias from external internet sources. 

As a result, the team recognized these mitigation factors related to bias, transparency, and accountability while maintaining a focus solely on model performance, safeguarding accuracy, and verifying predefined features of generative AI capabilities.

 

Late Adoption and Strategic Positioning

Another significant concern was the risk of late adoption and the potential for Emory to fall behind in AI advancements. To address this risk and ensure Emory remained competitive in AI-driven research administration, the project prioritized early AI adoption and invested in continuous training for research administrators. The project fostered a culture of innovation by positioning Emory as a frontrunner in integrating AI technologies and ensured that staff had the necessary skills to leverage AI tools effectively. This proactive approach aimed to secure Emory's competitive edge in research administration.

 

Future Planning and Impact Assessment

To ensure the continued effectiveness of these risk mitigation strategies, the project will incorporate a more structured document vetting process in the pilot phase. This approach will ensure that only current and accurate documents are used in the corpus, enhancing the model's reliability. Network security measures will be subject to ongoing testing and updates, while compliance practices will be reassessed as guidelines evolve. Data privacy protocols will be refined to align with emerging best practices, and the AI adoption strategy will be reviewed periodically to sustain Emory's leadership in research administration innovation. The project remains committed to maintaining a secure, compliant, and forward-thinking AI infrastructure by addressing immediate and long-term risks. Naturally, with infrastructure comes cost; the Azure platform-as-a-service model requires an understanding of how architectural decisions (i.e., additional services or resources) affect billing.

 

Cost Management and Budget Considerations

Managing costs was a critical component of the ORAgpt project, requiring transparent expense tracking and strategic resource allocation. The project team leveraged Microsoft Azure's platform to monitor and control expenses effectively, ensuring transparency and scalability based on resource usage. Understanding the costs associated with deploying cloud services, such as Microsoft Azure, was fundamental to the project's financial planning. This included itemizing the architecture and projecting costs related to data throughput, which allowed for a more accurate budget assessment.

The project team used the Microsoft Azure pricing calculator to review estimated service costs before resource deployment. This provided a comprehensive understanding of potential expenditures, essential for crafting an informed business case and communicating budget requirements to sponsors and stakeholders.

Azure services incur charges over time, including costs for storage operations (indexing, reading, and writing), token generation (input/output), and additional AI services. The team managed the budget efficiently by actively monitoring these costs, ensuring that expenditure aligned with the project timeline and available funding. The Azure Cost Management and Analysis tools offered insight into cost distribution, highlighting the most cost-intensive services and informing budget adjustments as necessary.

 

Figure 3. Example: Microsoft Azure Cost Analysis Tool.

This illustration demonstrates how developers can monitor resource expenditure using itemized breakdowns for monthly, hourly, and projected billing.

Note: The cost analysis shown is not specific to this project.

 

The team carefully selected Azure resource components and service plans based on project needs to optimize cost efficiency. Azure’s scalability allowed resources to be adjusted dynamically, scaling up or down as needed. This approach ensured cost-effectiveness, as resources were only scaled in response to specific demands, maintaining alignment with the budget.

Monthly costs were calculated based on actual resource usage, including user token generation and Azure AI Search service fees. By regularly monitoring these monthly expenses, the team could adjust resource usage proactively, maximizing cost efficiency throughout the project.

 

Specific Cost Management Considerations for Model Services

Specific cost items unique to AI model services, such as token usage, required careful consideration. User token generation incurred expenses based on the number of tokens processed by the chatbot GPT model, costing $0.002 per 1,000 tokens. 
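
At that rate, projected token charges can be estimated with simple arithmetic; the usage figures in the sketch below are hypothetical and serve only to show the calculation.

    # Back-of-the-envelope token cost estimate at $0.002 per 1,000 tokens.
    # The usage figures are hypothetical, for illustration only.
    RATE_PER_1K_TOKENS = 0.002           # USD, quoted GPT-3.5 Turbo rate

    queries_per_day = 200
    avg_tokens_per_query = 1_500         # prompt + retrieved context + response
    days_per_month = 30

    monthly_tokens = queries_per_day * avg_tokens_per_query * days_per_month
    monthly_cost = monthly_tokens / 1_000 * RATE_PER_1K_TOKENS
    print(f"{monthly_tokens:,} tokens -> about ${monthly_cost:.2f} per month")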

Azure Cognitive Search services also contributed significantly to the project’s overall costs. These services were essential for efficient information retrieval, with expenses varying according to the project’s scale, the range of resources selected, and the number of Search Units (SUs) allocated. The SUs determined capacity and usage patterns, influencing the service's overall cost and performance. Balancing functionality with budget considerations was critical, so the project frequently leveraged free-tier resource deployments, when possible, to minimize expenses.

 

Budget Constraints and Resource Optimization

Operating under a limited budget, the project relied on resourcefulness and the contributions of volunteers, including students and staff. The team initially used Azure’s free introductory credits ($200, valid for one month) to reduce costs, which supported early prompt testing and model verification. Once these credits were exhausted, the students’ professor provided additional financial support, paying the Azure subscription costs of $70 per month to sustain the project. The professor’s lighthearted remarks about the expense added a positive and humorous atmosphere to the project meetings. The overall cost of the ORAgpt PoC was estimated at less than $1,000.

This financial support was vital in extending the project's duration beyond its initial phase, allowing for further testing and optimization. Monitoring the minimum Azure architecture expenditures helped the team project costs for future pilot phases, ensuring it could plan for anticipated expenses and budget accordingly.

Given the budget limitations, testing efforts were focused on essential components, and the testing duration was carefully managed. This approach maximized the value derived from limited financial resources, ensuring that all critical project objectives were met efficiently and effectively.

 

Results

The AI chatbot, ORAgpt, underwent rigorous testing to ensure reliability and accuracy. Testing involved approximately 31 questions related to research administration, policies, and procedures, yielding an accuracy rate of over 90% during later tests. The chatbot's response times were significantly shorter than those of traditional methods, suggesting substantial time savings for real-world applications. For example, when prompted with, "How do I use the [Financial Outlook Reconciliation Tool] FORT for budget projection?" ORAgpt provided detailed step-by-step instructions that were consistent with official procedures. Another successful case involved guiding users through submitting an IRB application, where the chatbot presented a comprehensive guide covering each step, required documentation, and submission timelines.
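
A lightweight evaluation harness can make this kind of question-set testing repeatable across project phases. The sketch below scores chatbot answers against SME-validated reference answers using simple keyword overlap; the scoring rule, threshold, and all function names are illustrative assumptions and do not represent the project’s actual TEVV procedure.

```python
# Minimal evaluation harness sketch: score chatbot answers against
# SME-validated reference answers using simple keyword overlap.
# All names, the scoring rule, and the threshold are illustrative assumptions.

def keyword_overlap(answer: str, reference: str) -> float:
    """Fraction of reference keywords (length > 3) that appear in the answer."""
    ref_words = {w.lower() for w in reference.split() if len(w) > 3}
    ans_words = set(answer.lower().split())
    return len(ref_words & ans_words) / len(ref_words) if ref_words else 0.0

def evaluate(ask_chatbot, test_set, threshold=0.6):
    """Run each test question through the chatbot and report the pass rate."""
    passed = 0
    for question, reference in test_set:
        score = keyword_overlap(ask_chatbot(question), reference)
        passed += score >= threshold
    return passed / len(test_set)

# Hypothetical usage with a stubbed chatbot call:
# accuracy = evaluate(ask_oragpt, validated_questions)
# print(f"Accuracy: {accuracy:.0%}")
```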

The AI model effectively handled diverse content formats, including text-based PDFs, image-based files, and Excel sheets. Initially, the model struggled to produce robust responses, but accuracy improved significantly after enhancements such as Azure AI Search's Vector Search and Semantic Ranking capabilities were implemented. These additions refined the model’s search functions and improved output relevance. Due to time constraints, testing of capabilities such as generating or revising SOPs was deferred to the next project phase.
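
For readers interested in how such enhancements are typically wired up, the sketch below issues a hybrid query against Azure AI Search's REST interface, combining keyword search, a vector query, and semantic ranking. The endpoint, index name, field names, semantic configuration, and API version are assumptions for illustration; the project’s actual index schema and embedding pipeline may differ.

```python
# Illustrative hybrid query against Azure AI Search (REST), combining keyword
# search, vector search, and semantic ranking. Endpoint, index, API version,
# and field names are hypothetical; the query vector would normally come from
# an embedding model upstream of the chatbot.
import requests

SERVICE = "https://<search-service>.search.windows.net"   # placeholder endpoint
INDEX = "ora-knowledge"                                    # hypothetical index name
API_VERSION = "2023-11-01"                                 # assumed GA API version

def hybrid_search(query_text: str, query_vector: list[float], api_key: str) -> list[str]:
    body = {
        "search": query_text,                 # keyword portion of the hybrid query
        "vectorQueries": [{
            "kind": "vector",
            "vector": query_vector,           # embedding of the user's question
            "fields": "contentVector",        # hypothetical vector field in the index
            "k": 5,
        }],
        "queryType": "semantic",              # enable semantic ranking
        "semanticConfiguration": "default",   # hypothetical configuration name
        "top": 5,
    }
    resp = requests.post(
        f"{SERVICE}/indexes/{INDEX}/docs/search?api-version={API_VERSION}",
        headers={"api-key": api_key, "Content-Type": "application/json"},
        json=body,
        timeout=30,
    )
    resp.raise_for_status()
    return [doc["content"] for doc in resp.json()["value"]]  # assumed 'content' field
```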

Live demonstrations of ORAgpt during the Research Week conference on October 31, 2023, marked a significant milestone. The chatbot was showcased just six weeks after the project’s inception, and feedback from cross-departmental senior leaders provided a balanced view of the model's effectiveness, noting areas for improvement. Despite these areas for growth, the initial project goals—raising awareness of AI’s potential, creating a generative chatbot, and performing live demos—were successfully met (Yisrael & Konsynski, 2023).

Positive feedback from ORA staff underscored the model’s potential to streamline administrative processes and enhance productivity. By demonstrating the chatbot's capacity to generate clear, accurate, and standardized responses, ORAgpt proved its value as a tool that could significantly reduce response times and administrative burdens. However, the project was not without obstacles.

 

Challenges and Obstacles

The project faced several obstacles, beginning with budgetary constraints. Contributors, including students and staff, volunteered their time, and Azure’s introductory $200 credits were initially used to test the model. Upon exhausting these credits, the students’ professor contributed additional funds, highlighting the project’s reliance on limited financial resources. Resource expenditures, such as monthly fees for Azure AI Search, data storage operations, and token generation, limited testing activities and required careful cost management.

Platform selection was another challenge. Various options, including PrivateGPT and FlowiseAI, were tested before Microsoft Azure was chosen. PrivateGPT provided a Python-based terminal interface, while FlowiseAI offered a low-code platform for API integrations. Azure Cloud Services best met the project’s needs despite initial unfamiliarity with the platform.

Coordinating student schedules and managing varying levels of expertise required flexibility. With other academic obligations, students had to juggle project work with coursework, complicating scheduling and extending office hours. Moreover, the six-week timeline added pressure, demanding efficient document selection, promotional content creation, and model testing.

Technical challenges arose, including model resets and permission-related issues, sometimes leading to inaccessible chatbot instances. These problems, often stemming from bugs or user errors, necessitated rebuilding parts of the model. Despite the setbacks, the team gained proficiency in deploying chatbots and established multiple contingency plans.

Nevertheless, the initiative demonstrated Emory’s commitment to leveraging technology for operational excellence, setting a benchmark for research institutions worldwide. The ORA internal newsletter communicated the project's key results and goals to the Emory community, and senior leadership expressed excitement, stating, "We are excited to announce a groundbreaking initiative to revolutionize operations and enhance efficiency in the Office of Research Administration."

 

Discussion

Lessons Learned

The lessons learned from this project extended beyond technical development and provided key insights into effective project execution, stakeholder engagement, and long-term feasibility.

The initial dataset, which contained outdated SOPs, highlighted the importance of using current, well-curated documents. Data quality and accuracy play a crucial role in a generative chatbot’s effectiveness. The project demonstrated that an AI chatbot is only as strong as the quality of its training data. Outdated SOPs led to incorrect responses, even though the model itself functioned correctly. This reinforced the importance of maintaining up-to-date institutional knowledge bases. Although the outdated data was a practical choice for the proof-of-concept, projects must prioritize content accuracy to avoid confusion. Future phases of ORAgpt will implement a formal pre-deployment SOP validation process to ensure all included SOPs reflect current practices.
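
One way to operationalize that validation step is a simple pre-ingestion freshness check, sketched below. The metadata fields and the 24-month review window are hypothetical assumptions; the actual vetting workflow will be defined during the pilot phase.

```python
# Illustrative pre-ingestion vetting filter: admit only documents whose
# metadata shows a recent SME review. Field names and the 24-month review
# window are hypothetical assumptions, not an institutional policy.
from datetime import date, timedelta

REVIEW_WINDOW = timedelta(days=730)   # assumed 24-month freshness requirement

def is_current(doc_metadata: dict, today: date | None = None) -> bool:
    """Return True if the document was reviewed within the allowed window."""
    today = today or date.today()
    last_reviewed = doc_metadata.get("last_reviewed")   # hypothetical metadata field
    return last_reviewed is not None and (today - last_reviewed) <= REVIEW_WINDOW

def vet_corpus(documents: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split candidate documents into admitted and flagged-for-review lists."""
    admitted = [d for d in documents if is_current(d["metadata"])]
    flagged = [d for d in documents if not is_current(d["metadata"])]
    return admitted, flagged
```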

Managing user perception was also essential. AI skepticism emerged when chatbot outputs contradicted existing institutional knowledge. Effective communication and user education were necessary to ensure that stakeholders understood the model’s reliance on its provided data and to prevent misconceptions about AI performance.

Sustained investment is a prerequisite for long-term success. Although the proof-of-concept was successful, the project was put on hold due to the lack of financial planning for full-scale deployment. AI initiatives require dedicated funding beyond the pilot phase to ensure their continued development and integration. Future phases will include creating a formal budget proposal and presenting it to senior leadership to secure dedicated financial support before scaling beyond the proof-of-concept phase.

Real-time collaboration proved to be a key factor in the project’s agility. The informal yet highly responsive communication model enabled rapid problem-solving and alignment across teams. This flexibility allowed the project to meet its ambitious deadline despite the challenges encountered.

These lessons provide a roadmap for institutions looking to implement AI-driven solutions in research administration, ensuring both technical and strategic success.

 

Improvements and Recommendations

Several recommendations emerged from the proof-of-concept phase. First, continuously updating the data corpus with current policies and guidelines is essential. A dedicated content curation team could support this effort and ensure data accuracy and relevance.

Building in-house expertise is recommended to reduce dependence on external vendors, enhance project control, and lower costs. Structured stakeholder feedback loops will also be necessary for iterative improvements, fostering transparency and collaboration.

Adopting agile methodologies and scheduling regular progress reviews will improve project management. These measures will help address issues promptly and keep stakeholders aligned with project objectives.

Developing a comprehensive charter or business plan is essential. Clearly defining the chatbot’s purpose, objectives, risks, and institutional value helps align the initiative with organizational priorities before committing resources to development. 

Assembling a committed cross-functional team is critical. A dedicated group of technical developers, research administrators, and project managers ensures the initiative can be executed effectively within a compressed timeline. 

A cost framework should be developed before seeking investment. Financial projections must account for both the cost of building the proof-of-concept and scaling the model to full deployment. Presenting this financial roadmap prior to requesting funding enhances the likelihood of securing institutional backing.

Upcoming phases of this initiative will integrate structured SOP validation, expand testing with a larger validation set, introduce flexible timelines to accommodate technical challenges, establish a formal governance and communication framework, secure dedicated financial support through an official budget proposal, and align with Emory’s AI governance working group to ensure compliance with evolving regulatory and ethical standards.

 

Emerging Trends and Future Directions

The project revealed significant potential for AI to transform research administration at Emory. Emerging capabilities include leveraging generative AI models to analyze complex data sets, produce interactive dashboards, forecast research funding trends, and draft detailed impact statements for grant proposals and reports. AI integration could also streamline the creation of dynamic content, such as PowerPoint presentations tailored for research updates or Excel reports that automate budget projections and financial analysis. Additionally, there is potential for automating coding tasks to support administrative processes, such as database management or compliance tracking. Expanding Emory’s curated data sources and providing comprehensive prompt training for research administrators will be crucial to fully harnessing these advancements.

Future efforts could also focus on performing staffing analyses and balancing workloads to optimize human and AI-driven resources, enhancing efficiency and strategic decision making across the university’s research administration landscape. Scaling ORAgpt will address complexities in research administration, help train new staff, and improve quality control. Anticipated features include multi-modal AI capabilities and automated task execution. Developing pilot projects, such as the recommended post-award focus, will guide the scaling process and ensure the model’s continued relevance and utility.

 

Resource Accessibility: Alternatives for Deploying Private Generative AI Models 

Emory’s ORA proof-of-concept chatbot for a research administration-focused LLM was developed using a low-code approach via OpenAI Studio on the Microsoft Azure platform. Institutions with varying levels of resources and access to enterprise-level cloud solutions may explore similar approaches, such as OpenAI’s Private GPTs, Google’s Vertex AI, or other pre-trained generative AI models like Meta’s Llama, which can be hosted on local or cloud-based infrastructure. When selecting an AI platform, institutions should evaluate pricing models (pay-per-use vs. subscription), computational resource requirements, and scalability costs. Additionally, collaborating with institutional IT teams and AI governance committees is essential to ensure secure deployment, policy compliance, and the long-term sustainability of AI-driven solutions in research.

 

Acknowledgments

We want to acknowledge the contributions of the entire development team, including staff from the Office of Research Administration and Goizueta Business School, for their invaluable insights and support throughout the project. Special thanks are extended to the technical team at Microsoft Azure and the graduate students developing the AI model.

The success of the ORAgpt LLM project was made possible through the dedicated efforts of our development team, volunteer contributors, and stakeholders. Lisa Wilson, who originated the concept for the project, served as the principal lead, coordinating cross-functional teams, overseeing strategic planning and stakeholder engagement, and authoring key sections of the manuscript, including the introduction and discussion. Her leadership ensured the alignment of technical development with research administration needs. Benn Konsynski, PhD, played a significant role by covering the monthly subscription costs, infusing humor into our meetings, offering his students the chance to work on the project, promoting our story to enhance the project's visibility, and authoring key sections of the manuscript. Tubal Yisrael, Project Support Specialist, contributed significantly by mastering Microsoft Azure, managing the project plan, working alongside the students, and authoring key sections of the manuscript, including the methodology and results. Geoffrey Parson, Director of Cybersecurity, volunteered his expertise in architecture, system integration, and security. Ethan Norwood, the Graduate Student Team Lead, dedicated his time to fine-tuning the system, alongside the graduate and undergraduate "Students of Benn" from Goizueta Business School, who contributed in various roles.

We also extend our gratitude to the Microsoft Azure team (April McGuire, Bill Campman, Dominick Dennisur, George B. Freiberger, Katie Smith, Olivia Henshaw, and Vaibhav Pandey), whose assistance and guidance were crucial to the project's success. Additionally, we are grateful to the subject matter experts from ORA, including Davion Johnson, Brian Miller, Ashunti Gore, and Nicole Bell, for validating responses and participating in demonstration videos. Special appreciation goes to Aimee Kendall Roundtree, PhD, and Adrianne Taylor Wilson, PhD, for their academic and editorial support, and to Robert Nobles, PhD, for his leadership and encouragement in exploring innovative strategies.

 

Lisa A. Wilson
Assistant Vice President, Strategic Optimization and Training
Office of Research Administration, Emory University

Benn Konsynski, PhD
Professor
Goizueta Business School, Emory University

Tubal Yisrael
Project Support Specialist, Strategic Optimization and Training
Office of Research Administration, Emory University

 

Corresponding Author:

Correspondence concerning this article should be addressed to Lisa Wilson, Assistant Vice President, Strategic Optimization and Training, Office of Research Administration, Emory University. lisa.wilson@emory.edu

 

References

Deloitte Center for Higher Education Excellence. (2023). The future of research administration: Adapting to thrive. Deloitte. https://www2.deloitte.com/content/dam/Deloitte/us/Documents/public-sector/the-future-of-research-administration-adapting-to-thrive.pdf

European Union. (2024, July 12). Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonized rules on artificial intelligence and amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (Artificial Intelligence Act). Publications Office of the European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=OJ:L_202401689

European Union. (2024, August 1). Implementation timeline. Future of Life Institute (FLI). https://artificialintelligenceact.eu/implementation-timeline/

Konsynski, B. (2023). ORAgpt project in Appcology. Emory University. https://bit.ly/ORAgptPlaylist

Microsoft. (n.d.). Azure products by category. https://azure.microsoft.com/en-us/products/category/

Nonaka, I., & Takeuchi, H. (1995). The knowledge-creating company: How Japanese companies create the dynamics of innovation. New York: Oxford University Press.

National Institute of Standards and Technology. (2022, August 18). Artificial intelligence risk management framework: Second draft. U.S. Department of Commerce. https://www.nist.gov/document/ai-risk-management-framework-2nd-draft

Naveed, H., Khan, A. U., Qiu, S., Saqib, M., Anwar, S., Usman, M., Akhtar, N., Barnes, N., & Mian, A. (2023). A comprehensive overview of large language models. arXiv. https://doi.org/10.48550/arXiv.2307.06435

Popenici, S. A. D., & Kerr, S. (2017). Exploring the impact of artificial intelligence on teaching and learning in higher education. Research and Practice in Technology Enhanced Learning, 12, Article 22. https://doi.org/10.1186/s41039-017-0062-8

Simon, H. A. (1957). Rationality and administrative decision making. In Models of man: Social and rational (pp. 196–198). Wiley.

United States Government. (2024, October 30). Executive Order 14110: Safe, secure, and trustworthy artificial intelligence. Office of the Federal Register, National Archives and Records Administration (NARA). https://www.govinfo.gov/content/pkg/CFR-2024-title3-vol1/pdf/CFR-2024-title3-vol1-eo14110.pdf

Weber, M. (1947). The theory of social and economic organization (A. M. Henderson & T. Parsons, Trans.). New York: Oxford University Press.

Wilson, L. (2023a, August 11). Office of Research Administration to revolutionize operations with cutting-edge AI ChatGPT integration. Emory University. https://scholarblogs.emory.edu/orastaffnews/2023/08/11/harnessing-ai-technology-for-enhanced-efficiency-in-emory-research-administration

Wilson, L. (2023b, November). Unlocking opportunity: Leveraging generative AI in university-sponsored program operations. In E. Ruiz & G. Belle (Chairs), CUNY IT Conference 2023. Symposium conducted at the meeting of the CUNY IT Conference 2023, New York, NY.

Wilson, L. (2023c, November). AI in research administration: Promises, problems, and possibilities panel discussion. In C. Callahan (Chair), Research Week 2023. Symposium conducted at the Research Week 2023 Concurrent Session meeting, Atlanta, GA.

Yisrael, T., & Konsynski, B. (2023). ORA chatbot proof of concept video. Emory University. https://bit.ly/ORAgptPlaylist

 


 

Supplementary Content: List of Sample Prompts and Model Responses

This section contains a curated list of sample prompts tested with ORAgpt and the chatbot’s responses. The prompts are grouped by type: Training FAQs; Role, Responsibilities, and Task-Related Queries; Internal and External Guidance; and Award Management Queries. Each entry presents the prompt, followed by a summary of ORAgpt’s response.

Training FAQs

  1. Prompt: What research administration training is offered by ORA?
    • Response: ORAgpt provided a detailed list of available training programs, including self-paced and instructor-led courses. The chatbot emphasized the importance of continuing education for research administrators.
  2. Prompt: How many professional development hours are needed for ORA?
    • Response: ORAgpt outlined the required hours, citing Emory’s professional development policy and offering links to additional resources.
  3. Prompt: Where can I find the professional development tracking log?
    • Response: The chatbot directed users to the ORA website, specifying the location of the tracking log and how to access it efficiently.
  4. Prompt: What are the training requirements for staff involved in NIH clinical trials at Emory?
    • Response: A comprehensive explanation of NIH training requirements was provided, including required certifications and how to maintain compliance.

Role, Responsibilities & Task-Related Queries

  1. Prompt: What system will help me determine the status of the CT agreement?
    • Response: ORAgpt listed systems such as the Clinical Trial Management System (CTM) and detailed features for agreement status tracking.
  2. Prompt: How can I access the FORT, and what features does it offer for clinical trial financial management?
    • Response: The chatbot explained the FORT access process and described the features available for financial management, including budget projections and expense tracking.
  3. Prompt: What is the Award Closeout process?
    • Response: ORAgpt outlined the steps in the award closeout process, highlighting timelines, documentation requirements, and key responsibilities.

Internal Guidance

  1. Prompt: Tell me about Research Training and provide details on the continuing education policy requirements.
    • Response: The chatbot provided an overview of research training programs, emphasizing continuing education policies, and offering guidance on meeting these requirements.
  2. Prompt: How will eNOAs be distributed?
    • Response: ORAgpt specified the electronic Notices of Award (eNOAs) distribution method, including the departments responsible for distribution.
  3. Prompt: What is the escalation process if I am unable to validate and/or resolve a financial compliance or reporting issue?
    • Response: The response included a step-by-step escalation process detailing whom to contact at various stages and emphasizing the importance of timely communication.

External Guidance

  1. Prompt: What is Research.gov?
    • Response: ORAgpt briefly explained Research.gov, outlining its purpose and key features relevant to research administrators.
  2. Prompt: What is the Davis-Bacon Act?
    • Response: The chatbot summarized the Davis-Bacon Act, emphasizing its implications for federally funded construction projects and how it relates to research administration.

Award Management Queries

  1. Prompt: What is the award start date for "Meissa RSV Vaccine MV-006"?
    • Response: ORAgpt retrieved the start date from the provided database and presented it accurately.
  2. Prompt: What is the Award PI ID for "Meissa RSV Vaccine MV-006"?
    • Response: The chatbot provided the Principal Investigator ID, ensuring consistency with Emory’s research administration records.
  3. Prompt: In the Sample FORT file, tell me about the award Emory-CHOA Clinical Immunization.
    • Response: The model described the award details, including funding amounts, key milestones, and reporting requirements.
