Automated API Docs Generator Using Generative AI
Abstract— Our study improves the creation of Application Programming Interface (API) usage documentation by harnessing the efficiency and power of Generative AI. APIs play an important role in software integration and maintenance, but the process of creating API documentation has remained largely manual and has not evolved with time. This paper employs Generative AI to enhance the accuracy, speed, and scale of API documentation generation. The automated API documentation generator applies natural language processing through a large language model (the TinyPixel/Llama-2-7B-bf16-sharded model). Training data was created by web scraping the documentation pages of several large technology companies to obtain a high-quality, industry-standard documentation dataset; it was further diversified and enlarged using a GPT model so that it covers a wide range of API scenarios. Fine-tuning greatly enhanced the TinyPixel/Llama-2-7B-bf16-sharded model's efficiency and output quality, as shown by the reduced response time and the accuracy of the generated documentation. A comparative study confirms the effectiveness of the approach. The paper concludes with a comprehensive approach that should improve software development processes and pave the way for further developments in API documentation.

Keywords— API Documentation, Generative AI, Fine-Tuning, Large Language Models, Web Scraping, Natural Language Processing.

I. INTRODUCTION

APIs (Application Programming Interfaces) allow different software components to communicate with each other easily. To provide usability and integration and to maintain complex architectures, comprehensive documentation is required. Previously, manual or partially automated creation and maintenance made it difficult to keep documentation in step with rapid development updates. Therefore, an AI-driven API documentation generator, as proposed in this paper, is a better approach to creating API documentation.

This system is based on Generative AI, which combines machine learning and natural language processing to provide tools capable of handling difficult tasks that normally require human input. This strategy could substantially change how API documentation is written. The goals of this research are (1) to evaluate the economic and practical advantages of incorporating AI technology and (2) to show how AI technology improves data-driven operations by giving them the speed and efficiency expected in modern software development. The intended outcome is comprehensive, precise, and up-to-date API documentation. This study also examines the current state of API documentation strategy and highlights the limitations and flaws of existing methods. To address these issues, the paper proposes Generative AI, which gives readers access to a tool that can greatly increase the precision, speed, and scalability of API documentation creation.¹
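The pipeline outlined above (scraping vendor documentation pages, diversifying the samples with a GPT model, and fine-tuning a large language model on the result) is not accompanied by published code in this paper. The following is therefore only a minimal sketch of such a data-collection step, assuming the requests, beautifulsoup4, and openai Python packages; the seed URL, the gpt-3.5-turbo model choice, and the prompt wording are illustrative assumptions rather than the configuration actually used.

```python
# Minimal sketch of the data-collection step described in the abstract:
# scrape public API documentation pages, then ask a GPT model to
# rewrite each sample for a new scenario to diversify the training set.
# The URL, model name, and prompt below are illustrative assumptions.
import json
import requests
from bs4 import BeautifulSoup
from openai import OpenAI

SEED_URLS = [
    "https://developer.example.com/docs/rest-api",  # placeholder URL
]

def scrape_doc_page(url: str) -> str:
    """Download one documentation page and return its visible text."""
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()                      # drop non-content markup
    return soup.get_text(separator="\n", strip=True)

def augment_with_gpt(client: OpenAI, doc_text: str) -> str:
    """Ask a GPT model to rewrite the sample for another API scenario."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",               # assumed model choice
        messages=[
            {"role": "system",
             "content": "You rewrite API documentation samples for new, "
                        "realistic API scenarios while keeping the same "
                        "professional structure and level of detail."},
            {"role": "user", "content": doc_text[:4000]},  # bound prompt size
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    client = OpenAI()                        # reads OPENAI_API_KEY from env
    samples = []
    for url in SEED_URLS:
        original = scrape_doc_page(url)
        samples.append({"source": url, "text": original})
        samples.append({"source": url, "text": augment_with_gpt(client, original)})
    with open("api_docs_dataset.jsonl", "w", encoding="utf-8") as f:
        for sample in samples:
            f.write(json.dumps(sample) + "\n")
```

Each scraped page and each GPT-augmented variant is written to a JSONL file that a later fine-tuning step can consume.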
II. BACKGROUND AND EVOLUTION OF APPLICATION PROGRAMMING INTERFACES

APIs, or application programming interfaces, are crucial in the large field of software development because they allow the linkage and integration of different applications. To understand the importance of APIs, one must first explore …
Footnotes:
1. API - Wikipedia
2. API - Wikipedia
3. https://www.baeldung.com/javadoc
4. Intro to APIs: History of APIs | Postman Blog
5. https://editor.swagger.io/
…better than previous models that relied on large amounts of data.

RbG has been introduced as a documentation tool specifically designed for scientific and engineering software [6]. In generating documentation, this program automatically extracts mathematical formulas and decision-making logic from the code through code analysis. RbG arranges these documents together with comments taken from the source code itself, giving full control over the content that is provided. RbG's usage in many different professional contexts, such as reverse engineering, documenting new software projects, and updating existing system documentation, demonstrates its flexibility. These case studies demonstrate how RbG can be customised to satisfy different needs related to software development documentation.
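RbG itself targets scientific and engineering code and its implementation is not reproduced here; purely as an illustration of the general idea of deriving documentation from code analysis, the sketch below uses Python's standard ast and tokenize modules to collect signatures, docstrings, and inline comments from a source file. The file name is a placeholder.

```python
# Illustrative only: derive a documentation skeleton from code analysis,
# in the spirit of generators such as RbG (which targets scientific code).
# We walk a Python file's AST and collect signatures, docstrings, and
# inline comments as raw material for documentation.
import ast
import tokenize

def extract_doc_skeleton(path: str) -> str:
    with open(path, "r", encoding="utf-8") as f:
        source = f.read()
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            doc = ast.get_docstring(node) or "(no docstring found)"
            lines.append(f"{kind} {node.name} (line {node.lineno}): {doc}")
    # Inline comments can carry decision-making rationale worth documenting.
    with open(path, "rb") as f:
        for tok in tokenize.tokenize(f.readline):
            if tok.type == tokenize.COMMENT:
                lines.append(f"comment (line {tok.start[0]}): {tok.string}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(extract_doc_skeleton("example_module.py"))  # placeholder file name
```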
Reference [7] looks at the benefits of combining more general, generic instructions with specialist models for natural language task processing. The study compares models trained on general data with those designed for specific tasks to investigate the effect of adding extra data on performance. The results are particularly impressive when training data is scarce. The main finding of this work is that reliable, high-quality generalist data are essential to prevent models developed for specific tasks from performing inadequately. The work highlights the difficulties and complexities of carefully integrating data to enhance natural language processing algorithms.
Ref. | Study focus | Model | Dataset | Approach | Key outcome
[2] | News summary generation using LLM | LLM with evolutionary fine-tuning | Niche domain dataset, PENS, grain storage pest | Compared with TF-IDF and TextRank algorithms | NSG generates accurate, reliable summaries
[4] | Adapting language models to society | GPT-3 language model | Hand-curated dataset with target values | Metrics: output adherence, toxicity, common words | Metrics for PALMS process evaluation
[5] | CodeBERT for programming and NL-PL apps | CodeBERT pre-trained NLP model | Provided by Husain et al. (2019) | Pre-trained on large-scale corpus, then fine-tuned | Performance on code documentation, retrieval
[9] | Compute-optimal training of transformer | Chinchilla (Transformer language model) | Not explicitly mentioned | Investigated optimal model size and token count | Compute-optimal training solution
[10] | BERT fine-tuning for text classification | BERT (Bidirectional Encoder Representations) | IMDb, Yelp P., Yelp F., TREC, Yahoo! Answers, AG's News, DBPedia, Sogou News | Investigated various BERT fine-tuning methods | Achieved new state-of-the-art results on text classification
[13] | Fine-tuning pre-trained language models | RoBERTa LARGE, mBART LARGE | GLUE benchmark | Two-stage fine-tuning approach | Task-agnostic mask, adapter fine-tuning
Reference [8] investigates the tranGAN model and the use of Generative Adversarial Networks (GANs) for text generation. This model combines the actor-critic technique with a transformer architecture to address common text-generation issues, including sequence dependence and exposure bias. The Penn Treebank dataset was used to assess tranGAN's capacity to generate grammatically sound and logical sentences, and these assessments demonstrate its text-generation capabilities.

A prompt tuning technique is presented in [11] that makes it possible to condition already-trained language models for specific tasks in an effective manner. The method works well on a variety of natural language processing applications and requires less training time and fewer parameters than traditional fine-tuning techniques. To make a model produce targeted outputs without much additional training, prompt tuning customizes a prompt vector for the task at hand. Tasks such as question answering, natural language inference, and text categorization show how well this technique works: it is robust, and it can be combined with other methods efficiently and quickly, enhancing its general applicability.
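As a concrete, hedged illustration of the prompt tuning idea in [11] (not the authors' original setup), the PEFT library exposes the technique roughly as follows; the base model, initialization text, and virtual-token count here are arbitrary choices for the sketch.

```python
# Hedged illustration of prompt tuning as described in [11]: the base
# language model is frozen and only a small set of "soft prompt" embeddings
# is trained for the target task. Model name and settings are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model

base_model_name = "TinyPixel/Llama-2-7B-bf16-sharded"  # any causal LM works here

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

prompt_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Generate API documentation for the following endpoint:",
    num_virtual_tokens=16,          # size of the learned prompt vector
    tokenizer_name_or_path=base_model_name,
)

# Wrapping the model freezes its original weights; only the virtual prompt
# embeddings remain trainable, which is why prompt tuning needs far fewer
# trainable parameters than full fine-tuning.
model = get_peft_model(model, prompt_config)
model.print_trainable_parameters()   # a tiny fraction of the 7B parameters
```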
The purpose of [12] is to create greater transparency in evaluating language models. It provides a comprehensive framework called HELM that evaluates these models on aspects such as toxicity, efficiency, bias, fairness, robustness, accuracy, and calibration. Its evaluation of 30 popular language models on 42 scenarios greatly extends the scope of language model evaluation. Moreover, it outlines the trade-offs between metrics and models while also providing a benchmark for comparing different generative AI techniques across languages.

A detailed examination of generative AI can be found in [14], which discusses several advanced computational techniques for creating meaningful content. Generative AI is a rapidly expanding field whose limitations as well as opportunities must be considered. The methods presented include variational autoencoders (VAEs), generative adversarial networks (GANs), and deep learning, with examples of how they are used in different types of content creation such as writing, music, and photos. The paper also examines the problems facing the development and deployment of Generative AI systems, among them large training dataset requirements, worries about potential biases, and moral dilemmas.
Figure 5. Sample Output from our API Documentation.

V. RESULT ANALYSIS

Our focus was to maximize the performance of the TinyPixel/Llama-2-7B-bf16-sharded model by fine-tuning it on a dataset built specifically for API documentation. The fine-tuned model outperformed the baseline model in several categories.
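The full training configuration is not listed in this section, so the following is only one plausible fine-tuning setup, assuming Hugging Face transformers, datasets, and peft with LoRA adapters; the dataset file, hyperparameters, and the use of LoRA are assumptions made for illustration.

```python
# One plausible fine-tuning setup for the experiment described here:
# TinyPixel/Llama-2-7B-bf16-sharded adapted to API documentation with LoRA.
# Dataset path, hyperparameters, and the choice of LoRA are assumptions.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from peft import LoraConfig, get_peft_model

model_name = "TinyPixel/Llama-2-7B-bf16-sharded"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# LoRA keeps the 7B base weights frozen and trains small adapter matrices.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

# Assumed JSONL file with one {"text": ...} documentation sample per line.
dataset = load_dataset("json", data_files="api_docs_dataset.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-api-docs", num_train_epochs=3,
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, learning_rate=2e-4,
                           logging_steps=10, bf16=True),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```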
A. Initial Model Performance

The initial responses of the TinyPixel/Llama-2-7B-bf16-sharded model were problematic. Producing a piece of API documentation took an average of fifty seconds. Furthermore, these outputs frequently fell short of the task's true requirements in terms of the accuracy and comprehensiveness needed to create an excellent API specification. These first findings show that the model often misinterpreted the user's request for API documentation.
B. Performance after Refinement

After the model was fine-tuned, both its output quality and its speed increased significantly. It now takes only 36 seconds on average to construct an instance of API documentation. This acceleration is necessary for practical applications, especially in settings where software development proceeds rapidly. Furthermore, there was a discernible improvement in the quality of the documents generated: the updated model adhered to professional API documentation standards and produced documentation that was more precise, intelligible, and appropriate for the given scenario.
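The measurement procedure behind the reported averages (roughly fifty seconds before fine-tuning, 36 seconds after) is not described in detail; a minimal sketch of how such an average generation time could be collected is shown below, with the prompt, generation settings, and sample size as illustrative assumptions.

```python
# Minimal sketch of how the reported average generation times could be
# measured. Prompts, generation settings, and sample size are assumptions.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def average_generation_seconds(model_name: str, prompts: list[str]) -> float:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name,
                                                 torch_dtype=torch.bfloat16,
                                                 device_map="auto")
    total = 0.0
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        start = time.perf_counter()
        model.generate(**inputs, max_new_tokens=512, do_sample=False)
        total += time.perf_counter() - start
    return total / len(prompts)

prompts = ["Write API documentation for a REST endpoint that creates a user."]
baseline = average_generation_seconds("TinyPixel/Llama-2-7B-bf16-sharded", prompts)
# After fine-tuning, run the same measurement on the fine-tuned checkpoint
# (merged, or loaded through peft) and compare the two averages.
print(f"average seconds per documentation sample: {baseline:.1f}")
```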
C. Comparative Analysis

A comparison of the fine-tuned and original models reveals a large difference. In terms of generation speed as well as content correctness and relevance, the fine-tuned model performs better than the original. This illustrates the usefulness of our approach and the importance of using a dataset created specifically for the target task. Once the model had a clear notion of what to generate, the results produced from our dataset were not only appropriate and useful but also took less time to generate.

Figure 6. Performance Comparison Before and After Model Fine-Tuning.

Figure 7. Comparison of API Documentation Outputs Before and After Fine-Tuning.

VI. CONCLUSION

The process of creating and maintaining API documentation for modern applications has been made considerably easier by this study, which leverages the power of Generative AI to produce concise API documentation that meets industry standards. We optimized the TinyPixel/Llama-2-7B-bf16-sharded model to achieve significant gains in speed and quality. The improvements demonstrate the model's flexibility and accuracy and expand the potential of machine learning techniques in this setting. By using web technologies to present the model's output, we enhance the user experience and provide a visually appealing and engaging interface. As a result, developers and other stakeholders will find it simpler to understand and learn about API integration. The addition of HTML, CSS, and JavaScript has made the generated API documentation easier to read and comprehend. Now that documentation can be accessed more quickly and intuitively, developers can fully grasp the possibilities of APIs. In addition to shortening the learning curve for engineers, this method promotes a more organized and productive software development environment.

In the end, this study provides a strong basis for upcoming API documentation enhancements. A new era of precise, interactive, and user-centered documentation is being promoted by the combination of modern web technologies and advanced computational methodologies. This all-encompassing approach will change how we interact with API documentation and push the industry to new levels of efficiency, accessibility, and usability.
VII. REFERENCES

[1] González-Mora, C., Barros, C., Garrigós, I., Zubcoff, J., Lloret, E., & Mazón, J. N. (2023). Improving open data web API documentation through interactivity and natural language generation. Computer Standards & Interfaces, 83, 103657.
[2] Xiao, L., & Chen, X. (2023). Enhancing LLM with Evolutionary Fine
Tuning for News Summary Generation. arXiv preprint
arXiv:2307.02839.
[3] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal,
P., ... & Amodei, D. (2020). Language models are few-shot learners.
Advances in neural information processing systems, 33, 1877-1901.
[4] Solaiman, I., & Dennison, C. (2021). Process for adapting language
models to society (PALMS) with values-targeted datasets. Advances in
Neural Information Processing Systems, 34, 5861-5873.
[5] Feng, Z., Guo, D., Tang, D., Duan, N., Feng, X., Gong, M., ... & Zhou,
M. (2020). CodeBERT: A pre-trained model for programming and natural
languages. arXiv preprint arXiv:2002.08155.
[6] Moser, M., Pichler, J., Fleck, G., & Witlatschil, M. (2015, March). RbG:
A documentation generator for scientific and engineering software. In
2015 IEEE 22nd International Conference on Software Analysis,
Evolution, and Reengineering (SANER) (pp. 464-468). IEEE.
[7] Shi, C., Su, Y., Yang, C., Yang, Y., & Cai, D. (2023). Specialist or
Generalist? Instruction Tuning for Specific NLP Tasks. arXiv preprint
arXiv:2310.15326.
[8] Zhang, C., Xiong, C., & Wang, L. (2019, August). A research on
generative adversarial networks applied to text generation. In 2019
14th International Conference on Computer Science & Education
(ICCSE) (pp. 913-917). IEEE.
[9] Hoffmann, J., Borgeaud, S., Mensch, A., Buchatskaya, E., Cai, T.,
Rutherford, E., ... & Sifre, L. (2022). Training compute-optimal large
language models. arXiv preprint arXiv:2203.15556.
[10] Sun, C., Qiu, X., Xu, Y., & Huang, X. (2019). How to fine-tune BERT
for text classification?. In Chinese Computational Linguistics: 18th
China National Conference, CCL 2019, Kunming, China, October 18–
20, 2019, Proceedings 18 (pp. 194-206). Springer International
Publishing.
[11] Lester, B., Al-Rfou, R., & Constant, N. (2021). The power of scale for
parameter-efficient prompt tuning. arXiv preprint arXiv:2104.08691.
[12] Liang, P., Bommasani, R., Lee, T., Tsipras, D., Soylu, D., Yasunaga,
M., ... & Koreeda, Y. (2022). Holistic evaluation of language models.
arXiv preprint arXiv:2211.09110.
[13] Liao, B., Meng, Y., & Monz, C. (2023). Parameter-Efficient Fine-
Tuning without Introducing New Latency. arXiv preprint
arXiv:2305.16742.
[14] Feuerriegel, S., Hartmann, J., Janiesch, C., & Zschech, P. (2023).
Generative AI.