Papers by Sujit Kumar Chakrabarti
Physics Simulation Based Approach to Node Clustering
Studies in Computational Intelligence, Oct 4, 2022
LetGrade: An Automated Grading System for Programming Assignments
Lecture Notes in Computer Science, 2022

Proceedings of the 34th International Conference on Software Engineering and Knowledge Engineering
Symbolic execution is an important program analysis technique that has found a number of applications in the last fifteen years or so. Popular symbolic execution approaches use backtracking when faced with infeasibility along a path being explored. A simple backtracking strategy (i.e. backtracking by a single decision node) may suffice when the goal is to cover the entire control flow graph (CFG). However, if the goal is to cover specific parts of the CFG through a single path, simple backtracking may lead to non-optimality or even non-termination. In this paper, we present a weighted backtracking strategy (WBS) that exploits previous knowledge about the program behaviour to compute 'good' candidates as destinations of backtracking. We have integrated our heuristic into SymTest, a symbolic testing framework for embedded systems. Experiments with case studies have demonstrated that WBS improves SymTest's performance both in its ability to achieve termination and in computing shorter test sequences compared to the original approach. SymTest with WBS generates shorter test sequences compared to several other existing test generation approaches based on symbolic execution.

Automated Application Processing
Recruitment in large organisations often involves interviewing a large number of candidates. The process is resource intensive and complex. Therefore, it is important to carry it out efficiently and effectively. Planning the selection process consists of several problems, each of which maps to one or another well-known computing problem. Research that looks at each of these problems in isolation is rich and mature. However, research that takes an integrated view of the problem is not common. In this paper, we take two of the most important aspects of the application processing problem, namely review/interview panel creation and interview scheduling. We have implemented our approach as a prototype system and have used it to automatically plan the interview process of a real-life data set. Our system provides a distinctly better plan than the existing practice, which is predominantly manual. We have explored various algorithmic options and have customised them to solve these panel creation and interview scheduling problems. We have evaluated these design options experimentally on a real data set and have presented our observations. Our prototype, experimental process and results may be a very good starting point for a full-fledged development project for automating application processing.
PePRacer (ESOP 2022 Artifact): Code and Benchmarks
We consider the problem of statically detecting data races in periodic real-time programs that use locks, and run on a single processor platform. We propose a technique based on a small set of rules that exploits the priority, periodicity, locking, and timing information of tasks in the program. One of the key requirements is a response time analysis for such programs, and we propose an algorithm to compute this for the case of non-nested locks. We have implemented our analysis for real-time programs written in C in a tool called PePRacer and evaluated its performance on a small set of benchmarks from the literature.

Automated Test Case Generation for Programmable Logic Controller Code
Proceedings of the 12th Innovations on Software Engineering Conference (formerly known as India Software Engineering Conference), 2019
With the advent of programmable logic controllers (PLCs), the use of control applications has increased significantly in recent years. Control system software is usually embedded real-time software which interacts with other physical subsystems. The language we are dealing with is the IEC 61131-3 standard, a high-level domain-specific programming language for developing control system software. Testing of programs developed in such a language is difficult because of the occurrence of real-time interrupts, exception handlers and complex data types. Here, a symbolic execution framework, SymTest, is utilized to obtain test inputs of control system software. SymTest does not ensure optimal feasible paths after backtracking. In this paper, an approach using pattern matching and IDA* is utilized to overcome this drawback. The symbolic execution framework is implemented for small programs in IEC 61131-3.
Connectedness testing of RESTful web-services
Proceedings of the 3rd India software engineering conference on India software engineering conference - ISEC '10, 2010
... namely {userid} and {blogid}. A named parameter is similar to a variable in that it can be instantiated to a string literal value. For instance, if {userid} = "abcd" and {blogid} = "xyz", the value of Blog.URI becomes "eblog/abcd/xyz". ...
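The instantiation of named parameters described in this excerpt can be sketched as follows. This is only an illustration, not the paper's implementation: the template string "eblog/{userid}/{blogid}" is an assumption inferred from the resulting value quoted above, and the function name `instantiate` is hypothetical.

```python
import re


def instantiate(template, bindings):
    """Replace every {name} in `template` with the string bound to `name`.

    `instantiate` is a hypothetical helper written for this illustration.
    """
    def substitute(match):
        name = match.group(1)
        if name not in bindings:
            # An unbound named parameter cannot be instantiated.
            raise KeyError(f"no value bound for named parameter {{{name}}}")
        return bindings[name]

    return re.sub(r"\{(\w+)\}", substitute, template)


# Assumed template for Blog.URI; the excerpt only gives the instantiated value.
BLOG_URI_TEMPLATE = "eblog/{userid}/{blogid}"

blog_uri = instantiate(BLOG_URI_TEMPLATE, {"userid": "abcd", "blogid": "xyz"})
print(blog_uri)  # prints eblog/abcd/xyz
```

Raising an error on an unbound parameter, rather than leaving the placeholder in the URI, is a design choice made for this sketch only.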
Programming Languages and Systems, 2022
We consider the problem of statically detecting data races in periodic real-time programs that use locks, and run on a single processor platform. We propose a technique based on a small set of rules that exploits the priority, periodicity, locking, and timing information of tasks in the program. One of the key requirements is a response time analysis for such programs, and we propose an algorithm to compute this for the case of non-nested locks. We have implemented our analysis for real-time programs written in C in a tool called PePRacer and evaluated its performance on a small set of benchmarks from the literature.
Using Explicit State Space Enumeration For Specification Based Regression Testing
As web applications are becoming more and more ubiquitous, modeling and testing web applications correctly is becoming necessary. In this paper, we have used a formal specification language, State chart, to model the navigation behaviour aspect of web applications. This paper presents the ACT (Abstract to Concrete Tests) tool, an approach for generating concrete executable Selenium RC JUnit test cases from a formal State chart specification model. The ACT tool can generate concrete Selenium RC JUnit test cases from abstract test cases by utilizing data shared across different interactions of the web application with the web server. Throughout the paper, a case study of a Learning Management System is used to illustrate our approach.

Proceedings of the 9th India Software Engineering Conference, 2016
In test case generation methods based on symbolic testing and/or model checking, the primary emphasis is on covering code/model elements and not on optimising test sequence length. However, in certain domains, e.g. embedded systems, GUI, networking software, the testing process may involve interaction with other physical subsystems, possibly remotely situated. Thus, test sequence length may have important implications on the cost of testing. In this paper, we present SymTest, a novel framework for test sequence generation for testing embedded systems. SymTest selects good control flow paths so as to generate shorter test sequences. In case of unsatisfiability, SymTest explores the neighbouring paths using backtracking and heuristics. SymTest is distinctive w.r.t. other related methods in its attempt to generate shorter test sequences while searching for feasible paths. The other novelty is that SymTest allows plugging in heuristics in a flexible way, a feature because of which we call S...
Handling Backtracking for Symbolic Testing of Embedded Software
2019 24th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), 2019
Automated testing tools for programs written in the IEC 61131-3 standard of programming languages, which supports the development of PLC control software, are less readily available. Symbolic execution is a program analysis technique that determines what inputs cause each part of a program to execute. Here, a symbolic execution framework is utilized to achieve test target coverage for control system software. The framework does not ensure optimal feasible paths after backtracking. In this paper, an approach using weight calculation is utilized to overcome this drawback. The calculated weight values help to backtrack to a node which, on flipping, will result in an optimal feasible path.

Automated Testing of Refreshable Braille Display
Human-Centric Computing in a Data-Driven Society, 2020
A majority of the visually impaired population of India and other developing economies lives in poverty. Accessibility without affordability has little meaning to this population. Assistive technology has great potential to make education accessible to this population, e.g. through refreshable Braille display devices. However, most existing solutions in this space remain out of reach for these users due to high cost. Innovation in data science and software engineering can play an important role in making assistive technological solutions affordable and accessible. In this paper, we present a machine-learning based automated testing approach that has played an important role in enabling us to design one of the most affordable refreshable Braille display devices of the world. The key component of our approach is a visual inspection module (VIM) created using Convolutional Neural Networks (CNNs). In our experiment, our model was able to detect malfunction of a Refreshable Braille display wi...
Formalizing GPU Instruction Set Architecture in Coq
GPUs are now a mainstream compute device. They are widely used to render images on medical devices. Today, it has become impossible to imagine AI without them. To build confidence in the accuracy of rendering images and complex calculations, it is essential to consider formalizing the behaviour of GPU Instruction Set Architecture (ISA) at the assembly language level. In this paper, we present the formalization of GPU shader programs. We prove some properties of shader programs with respect to operational semantics of our formal model. We use Coq to mechanize the formalization of our model and proofs.

With the emergence of a knowledge driven economy, these are times of brisk business for institutions providing training to professionals. The flipside of this is that there is an ever increasing demand on them to create a growing number of new courses, and serve an ever increasing flow of customers at a faster and faster pace. In its various flavours, this problem is faced by MOOC/online course providers, training departments of large organisations, and modern universities which offer short term courses for corporate audiences as well. Hence, there is a need to optimally utilise all available resources (time of trainer/trainee/organisation, infrastructure etc.) while designing and conducting training courses. Current practice is to do this manually. In this paper, we build a case for automation of this process. We present mathematical models for 3 problems in this space. Once a mathematical model of a problem is developed, it becomes possible to carry out several analyses on these probl...

The First International Conference on AI-ML-Systems, 2021
A majority of the current automated evaluation tools focus on grading a program based only on functionally testing the outputs. This approach suffers from both false positives (i.e. finding errors where there are not any) and false negatives (missing out on actual errors). In this paper, we present a novel system which emulates manual evaluation of programming assignments based on the structure and not the functional output of the program using structural similarity between the given program and a reference solution. We propose an evaluation rubric for scoring structural similarity with respect to a reference solution. We present an ML based approach to map the system predicted scores to the scores computed using the rubric. Empirical evaluation of the system is done on a corpus of Python programs extracted from the popular programming platform, HackerRank, in combination with programming assignments submitted by students undertaking an undergraduate Python programming course. The preliminary results have been encouraging with the errors reported being as low as 12 percent with a deviation of about 3 percent, showing that the automatically generated scores are in high correlation with the instructor assigned scores. CCS CONCEPTS • Social and professional topics → Computing education; • Information systems → Near-duplicate and plagiarism detection.

Proceedings of the 13th Innovations in Software Engineering Conference on Formerly known as India Software Engineering Conference, Feb 27, 2020
The complexity of present-day specification models has become non-trivial. Hence, there is a need to evolve existing specification languages to support writing specifications following good coding practices such as incremental development and modularisation. Statechart is a modelling notation that has wide acceptance in the industry. To the best of our knowledge, all current implementations of Statecharts have one common shortcoming: all Statechart variables are global. Global variables in a specification can lead to monolithic and fragile models which are hard to maintain and reuse. In this paper, we introduce local variables in Statecharts, motivate their use through illustrative examples, formalise their semantics, and analyse their interaction with basic Statechart features like hierarchical states, transitions and history. We have implemented this Statechart variant with local variables in a specification language called StaBL. Our case studies demonstrate significant improvement in modularity in models with local variables w.r.t. those without local variables. CCS CONCEPTS • Software and its engineering → Formal language definitions; Semantics.

Context and motivation: Usage of formal specification languages is scarce in web application development as compared to safety critical/hardware systems. Question/problem: An apt formal specification language should provide the following features: Firstly, it should have well-defined semantics, so that specifications written in it cannot be inherently ambiguous. Secondly, it should have tool support for automatic detection of specification bugs. Additionally, for domains like web development, it is important that specification formalisms build over familiar notations, as the benefits of learning highly mathematical notations in such domains are perceived to be low. Principal ideas/results: This work presents a State Based Language inspired by Statecharts called StaBL for specification of web applications, and how StaBL can be used for writing such specifications. We also present modifications to the language w.r.t. Statechart which facilitate writing modular and scalable specificati...
Discovering multiple design approaches in programming assignment submissions
Proceedings of the 36th Annual ACM Symposium on Applied Computing
Designing Software Engineering Courses for Effective Teaching and Learning
Proceedings of the 10th Innovations in Software Engineering Conference on - ISEC '17, 2017