I have a dataset of student essays: 1700 essays in 20 Excel files, which I've imported into NVivo 12 Plus. The database is structured with two files per week and one row per student for each week. Each row carries a set of identifiers (student name, date, etc.), with the student ID being the unique identifier in each file (and common across files). There is a node for each weekly essay, with each essay imported as a single Excel cell.

I want to content analyse the essays for the topics the students talk about.  I'm not getting the results I want with machine coding, so I'm going to hand code two of the files (170 essays).

Here's the problem:

- On the one hand, I want to analyse the relationship between the codes (e.g. student X talked about topic 1 and topic 2). That suggests attaching the code to the entire cell.

- On the other hand, I want to reuse the codes in the future on other files via machine coding. That suggests making the codes as precise as possible.

The question is: do I attach the codes to individual sentence fragments, or to the entire Excel cell? If I attach them to sentence fragments, is there a way of analysing them as if they were attached to the entire cell?
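To make that last question concrete, here is a minimal Python sketch of what "analysing fragment codes as if they were attached to the entire cell" would mean. It runs outside NVivo and uses no NVivo API; the student IDs, topic names, fragments and function names are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical fragment-level coding: (student_id, week, topic, coded_fragment).
fragment_codes = [
    ("s001", 1, "teamwork", "we split the tasks between us"),
    ("s001", 1, "growth mindset", "I can get better with practice"),
    ("s002", 1, "teamwork", "my group met twice"),
    ("s002", 1, "teamwork", "we shared notes"),
    ("s003", 1, "growth mindset", "mistakes help me learn"),
]

def roll_up(codes):
    """Collapse fragment-level codes to one set of topics per essay (cell)."""
    topics_by_essay = defaultdict(set)
    for student_id, week, topic, _fragment in codes:
        topics_by_essay[(student_id, week)].add(topic)
    return dict(topics_by_essay)

def essays_with_all(topics_by_essay, *topics):
    """Return the essays whose rolled-up topic set contains every given topic."""
    wanted = set(topics)
    return sorted(key for key, found in topics_by_essay.items() if wanted <= found)

essay_topics = roll_up(fragment_codes)
both = essays_with_all(essay_topics, "teamwork", "growth mindset")
print(both)  # [('s001', 1)]
```

The point of the sketch is that the precise fragment codes are kept as they are, and the essay-level view is derived from them rather than coded separately.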


Hi Peter,

Coding the entire row will be fine, because this codes all the codeable cells in the row at once. Note that this will create one reference for each codeable cell in the row. You can also code entire columns to the relevant topic.

Please refer to the link below to understand how you can use automatic coding for this purpose:


You can then run a Matrix Query between your Cases and Topics to find out what a student said about a particular topic.
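As a rough illustration of what such a case-by-topic matrix contains, here is a short Python sketch. It is plain Python rather than NVivo's own query, and the cases, topics and the `case_topic_matrix` helper are made up for the example; each entry in the list stands for one coded reference.

```python
from collections import Counter

# Hypothetical coded references: one (case, topic) pair per coded reference.
references = [
    ("s001", "teamwork"),
    ("s001", "growth mindset"),
    ("s002", "teamwork"),
    ("s002", "teamwork"),
]

def case_topic_matrix(refs):
    """Count coded references for every case/topic combination."""
    counts = Counter(refs)
    cases = sorted({case for case, _ in refs})
    topics = sorted({topic for _, topic in refs})
    # Counter returns 0 for combinations that were never coded.
    return {case: {topic: counts[(case, topic)] for topic in topics} for case in cases}

matrix = case_topic_matrix(references)
for case, row in matrix.items():
    print(case, row)
```

Each row is a case (student) and each column a topic, so a zero means that student has nothing coded at that topic.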

Manually coding each cell would definitely be more precise, but it will involve a lot of work.

Kind Regards,


Hi Bupesh,

I'm not sure you've fully understood what I'm trying to do, so maybe I can try again.  I have about 1700 essays to code each semester, and want to repeat the exercise in subsequent semesters. 

1) Consequently, when it comes to CODING my data, I would like to be able to "Autocode my data based on existing coding patterns". From my understanding of how the Autocode functionality works, I will get a better result if I code as precisely as possible.

2) However, when it comes to ANALYSING my data, I would like to be able to perform matrix analyses. (For example, I might like to know which respondents talk about teamwork AND a growth mindset.) From my understanding of how the analysis engine works, I can only do this if the entire essay (i.e. the entire cell in the Excel spreadsheet) is coded at both nodes.

Clearly, these two imperatives are in conflict.  Is there some way around it, or do I need to code my data twice, into two separate folders, once for later autocoding and once for analysis?  
