FOR EDUCATORS: Measuring Impact of Capstone Courses on Students’ Employability


By Swapnil Lokhande.
Faculty Mentor: Julia Ivy
May 14, 2020


This research analyzes the impact of capstone programs on the employability of students and job seekers. The core of the analysis is identifying the keywords that authors use when discussing the benefits of capstone courses in the academic curriculum. The goal is to categorize the findings through the BE-EDGE concept and see how far the capstone project helps students develop their Personal, Social, and Professional Capital.




Article selection and filtering

For the keyword analysis, the main source of data is publicly available articles about capstone projects. The dataset comprises articles on the benefits of capstone projects, drawn from recommended articles and from Google searches for the following phrases: What is the capstone, Benefits of capstone projects, Importance of capstone project, Why capstone projects are important.

Only articles that discuss capstone project benefits in general were selected. This avoids bias toward a particular program or degree, so that the findings represent capstone projects in general rather than a capstone required by a specific degree or university program.

The first 5 pages of the Google search results gave relevant articles; beyond that, the articles were more specific to a particular type of program or course. Thus, only articles on the first 5 pages of the Google search results were selected.


Data Extraction Method

To analyze the articles and gain insights from them, the content of the selected articles, which are available in HTML format, needs to be extracted and stored in a simple text format (CSV or JSON) so that the text can be used for further processing and analysis. Thus, the first step in this project is to build an application that extracts the desired content from different websites and stores it in the required format. To accomplish this, web scrapers are deployed to gather the data.


Web scraper for articles – Purpose and Technique

  • A simple web scraper was designed in Python to pull the content from an article.
  • The user passes the URL of a freely available article to the application (for example, an article from The Conversation or The New York Times).
  • The application uses the requests and BeautifulSoup packages for Python to retrieve the HTML code of the given article and process it to pull the required content.
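The scraper's own code is not shown in this article, so the following is only a minimal sketch of how such a tool might look with requests and BeautifulSoup. The function names, the choice of <p> tags as the article body, and the JSON output layout are all assumptions made for illustration.

```python
import json

from bs4 import BeautifulSoup


def extract_article_text(html):
    """Return the title and paragraph text found in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else ""
    # Assumption: the article body sits in <p> tags; real sites may differ.
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    paragraphs = [p for p in paragraphs if p]
    return {"title": title, "text": "\n".join(paragraphs)}


def scrape_article(url, timeout=10):
    """Download a freely available article and extract its text."""
    import requests  # imported here so the parser above also works offline

    response = requests.get(url, timeout=timeout)
    # Login-protected pages may raise here or return a login form instead.
    response.raise_for_status()
    record = extract_article_text(response.text)
    record["url"] = url
    return record


def save_records(records, path):
    """Store the extracted articles as JSON for later analysis."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2)


# Offline demonstration on a small, made-up HTML snippet.
sample_html = """
<html><head><title>Benefits of capstone projects</title></head>
<body><p>Capstone projects build critical thinking skills.</p>
<p>Students solve real world problems.</p></body></html>
"""
sample_record = extract_article_text(sample_html)
```

In practice, scrape_article would be called once per selected URL and the resulting list passed to save_records; most sites need their own selector in place of the generic <p> lookup.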



  • The scraper cannot extract content from articles that require a login to the portal.
  • Example: articles on the Northeastern Library portal can only be accessed after logging in to a student account through myneu. The Chronicle of Higher Education also requires a login to access its articles.




The main goal of this research is to identify keywords that are frequently used and highly relevant to the topic, the benefits of capstone programs. To identify such keywords, a weighting technique widely used in Natural Language Processing and Machine Learning is applied: Tf-Idf (term frequency-inverse document frequency). It is generally used when processing human-readable language and converts text into a numerical format in which each document is represented as a row of word scores in a matrix (Gajare, n.d.).


How to calculate Tf-Idf score


TF-IDF for a word in a document is calculated by multiplying two different metrics:

  • The term frequency (Tf) of a word in a document is a raw count of the times the word appears in that document.
  • The inverse document frequency (Idf) of the word across a set of documents is calculated by dividing the total number of documents by the number of documents that contain the word and taking the logarithm of the result. The Idf indicates how common or rare a word is in the entire document set: the closer it is to 0, the more common the word is, and the larger it is, the rarer the word is.
  • Multiplying these two numbers gives the TF-IDF score of a word in a document. The higher the score, the more relevant that word is in that particular document (Stecanella, 2019).
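The calculation described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code used in the study: the natural logarithm, the whitespace tokenizer, and the three toy documents are all choices made here for the example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    TF is the raw count of a term in a document; IDF is
    log(total documents / documents containing the term).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Toy documents, invented for this example.
docs = [
    "capstone projects build skills",
    "capstone projects need a sponsor",
    "capstone courses help students",
]
scores = tf_idf(docs)
```

Here "capstone" appears in every document, so its Idf and its score are 0, while "sponsor" appears in only one document and receives the highest score in that document.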




The dashboard below illustrates the findings. The frequency of occurrence of words having a high Tf-Idf score is used to compare the words present in different categories.

The result consists of the bi-grams and tri-grams associated with the articles on the benefits and importance of capstone projects for students’ employability. The keywords are listed in descending order of rank and frequency. Here, rank is the Tf-Idf score, which shows the importance or relevance of the word in the given article. For example, the keyword “sponsor organization” has a higher score in articles on the benefits of capstone projects, since a capstone project is assumed to involve a sponsor organization with which students work to accomplish the goals of both the project and the organization.
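As a rough illustration of how bi-grams and tri-grams might be extracted and ordered by rank and frequency, consider the sketch below. How the study combined scores across articles is not stated, so taking each phrase's best score over all documents is an assumption, as are the function names and the sample sentences. Stop words are not removed here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-word phrases in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rank_phrases(documents, sizes=(2, 3)):
    """Rank bi-grams and tri-grams by best TF-IDF score, then total frequency."""
    doc_phrases = []
    for doc in documents:
        tokens = doc.lower().split()
        doc_phrases.append(Counter(p for n in sizes for p in ngrams(tokens, n)))
    n_docs = len(documents)
    # Each Counter has unique keys, so this counts documents per phrase.
    doc_freq = Counter(p for counts in doc_phrases for p in counts)
    total_freq = Counter()
    best_score = {}
    for counts in doc_phrases:
        total_freq.update(counts)
        for phrase, count in counts.items():
            score = count * math.log(n_docs / doc_freq[phrase])
            best_score[phrase] = max(score, best_score.get(phrase, 0.0))
    rows = [(p, best_score[p], total_freq[p]) for p in best_score]
    # Highest score first; ties are broken by frequency, then alphabetically.
    return sorted(rows, key=lambda row: (-row[1], -row[2], row[0]))


# Sample sentences invented for this example.
articles = [
    "students work with a sponsor organization",
    "students solve real world problems",
    "students apply skills to real world problems",
]
ranked = rank_phrases(articles)
```

Each row of ranked holds a phrase, its score, and its total frequency. Phrases found in a single article, such as "sponsor organization", score higher than phrases shared by several articles, such as "real world problems".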


Next, we aimed to find out what exactly students associate with the highest impact on their employability. We applied Ivy’s BE-EDGE concept for data categorization to test personal capital, social capital, and professional capital as measures of the capstone’s impact on graduates’ employability.


 Personal capital

EDGE Required Words | Other meaning | Words related to the analysis
Identity | Self-esteem, individuality | Knowledge and skills, academic work, field of study, intellectual property
Focus | Center of interest or activity | Develop expertise specific to problem
Strategy | Plan of action | Deep understanding, geared towards working
Choice | Making a decision when faced with two or more possibilities, or well chosen/good fit |
Vision | The ability to think or plan about the future | Learn leadership relevant courses
Goals | Aim or desired result | Academic work
Ownership | Right of possessing something | Receive academic credit
Empowerment | Power given to someone to do something, or becoming stronger and more confident | Knowledge and skills gained, serve as culminating academic


Social capital

EDGE Required Words | Other meaning | Words related to the analysis
Trust | Quality of being true, reliability, reliable, shared understanding |
Empathy | Ability to understand and share the feelings of others | Receive strong support
Relationships | The state of being connected | Faculty advisor supervises teams
Rapport | Understand each other's feelings and share ideas |


Professional capital

EDGE Required Words | Other meaning | Words related to the analysis
Justification | Action of showing something reasonable | Literature review, conduct research, facing host/sponsor organization, develop research plan
Proof | Evidence to help establish a fact | Oral presentation, practical experience, provide opportunity, view exciting opportunity
Design thinking (preferably used by designers and design teams) | Cognitive, strategic and practical processes by which design concepts are developed by designers | Critical thinking skills, real world problems, solve problems, apply skills, apply knowledge, develop expertise specific to problem, develop and use public speaking, address strategic challenges, receive objective study of critical issue
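The article does not describe how the keywords were assigned to the three capitals, so the following is only a sketch of one possible approach: a dictionary lookup built from a few phrases in the tables above. The lookup itself, the function name, and the example frequencies are hypothetical.

```python
from collections import Counter

# Phrase-to-capital lookup built from a few entries of the tables above.
# The matching procedure used in the study is not described, so this
# simple exact-match lookup is an assumption made for illustration.
CAPITAL_PHRASES = {
    "personal": ["knowledge and skills", "academic work", "field of study"],
    "social": ["receive strong support", "faculty advisor supervises teams"],
    "professional": [
        "literature review",
        "conduct research",
        "critical thinking skills",
        "real world problems",
    ],
}


def categorize(keyword_counts):
    """Sum keyword frequencies per capital; unmatched keywords are skipped."""
    lookup = {
        phrase: capital
        for capital, phrases in CAPITAL_PHRASES.items()
        for phrase in phrases
    }
    totals = Counter()
    for keyword, count in keyword_counts.items():
        capital = lookup.get(keyword.lower())
        if capital is not None:
            totals[capital] += count
    return totals


# Hypothetical frequencies, not results from the study.
example_counts = {
    "Academic work": 4,
    "Real world problems": 3,
    "Receive strong support": 1,
    "unrelated phrase": 2,
}
totals = categorize(example_counts)
```

Comparing the totals per capital is one simple way to see which dimension dominates the extracted keywords.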




First, the study revealed that the dimensions of personal capital, social capital, and professional capital provide adequate measures of the capstone’s impact. They refer to the benefits that students associate with investments in their edge for the market.


Second, we found that the majority of the words associated with the capstone’s benefits belong to Personal and Professional capital. Only two words belong to Social capital, and even their frequency is very low. The findings let us assume that the team nature of capstone courses helps students to Elucidate their professional CORE (personal capital) and to justify that they are ready to Generate VALUE (professional capital) for the company-client.


Interestingly, however, we found only a minor effect of capstone courses on students’ Developed TRUST (social capital). This lets us assume that the group nature of capstone projects has a mediating role in setting the capstone as a major stepping-stone project for graduating students. While group capstone projects still helped graduates to clarify the direction of their professional journey and to prove their professional readiness, they did not help personal contacts with the industry flourish, which limited their impact on students’ employability.


Note: The analysis was done on a sample of data, and the results may vary if more articles are collected. Further research could use other Machine Learning algorithms for Natural Language Processing to capture the relationships between words more accurately and to group and classify words based on those relationships. This would expand the research in the future.




Gajare, S. (n.d.). Tf-Idf for Bi-grams and Tri-grams. Retrieved from

Stecanella, B. (2019, May 10). What is Tf-Idf? Retrieved from


About the author: 

Swapnil Lokhande, Data Science and Advanced analytics practitioner and researcher.

A graduate student from Northeastern University, Boston, with a Master’s in Analytics. I am looking for opportunities to apply my analytics and computer science skills to deliver business solutions and, at the same time, to apply my knowledge in research projects. I have an extensive educational background and multi-dimensional industry experience, which make me a passionate learner and a problem solver. In the last 5 years of my career, I have developed expertise in delivering data-driven solutions that analyze business trends through statistical and predictive models, and in communicating the findings and statistical results effectively to technical and non-technical teams using interactive dashboards.

Outside of work, I am passionate about mentoring and teaching emerging engineering students. I have also worked as an Assistant Professor at a public university in India, and I always try to connect with my students to help them learn programming languages using real-world problems, mentor them in their academic projects, and help them make informed decisions about their career paths.



Julia Ivy, PhD Psych, PhD Mgmt, is a Strategy and International Business Executive Professor and faculty director at Northeastern University.

Her area of expertise is bridging strategy and psychology in the concept of personal strategy, and synchronizing the personal strategy of job candidates with the competitive strategy of potential employers. In addition to her academic work, she acts as an executive coach for those facing the “What’s next?” challenge.

Her new book is Crafting Your Edge for Today’s Job Market: Using the BE-EDGE Method for Consulting Cases and Capstone Projects (Emerald Publishing, Oct. 7, 2019).
