After taking measurements of the objects, I made templates for each of them in Illustrator and began trying to assemble them in After Effects.
When I made the first templates, I converted feet and inches straight to pixels, which didn't work out accurately because there are 12 inches to a foot rather than 10. Either way, the first one wasn't too far off in terms of proportions. I managed to comp some of the rotoscope footage I've completed so far into this test to have a rough go at the beginning of the animation.
After finishing this test I decided to rework the measurements on the templates. This time I converted them from feet and inches to centimeters and adjusted the templates so that 1 pixel represents 1 centimeter, which gave more accurate proportions. Once all the templates are finalized I will scale the artwork up to roughly double the size needed, so that when I import it into AE and scale it down by half it creates a super-sampling effect, meaning minimal pixelation! The downside is that such high-resolution images can be very demanding on computing resources, but I shouldn't have any problems there.
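For anyone who wants to check the numbers, the conversion can be sketched roughly like this in Python. This is only an illustration of the arithmetic, not something I actually ran for the project: the function names and the 8 ft 6 in example wall are made up, and the only fixed facts are 12 inches per foot and 2.54 cm per inch.

```python
# Rough sketch of the template maths: feet/inches -> cm -> pixels.
# Function names and the example measurement are hypothetical.

INCHES_PER_FOOT = 12
CM_PER_INCH = 2.54


def feet_inches_to_cm(feet, inches=0.0):
    """Convert a feet-and-inches measurement to centimeters."""
    total_inches = feet * INCHES_PER_FOOT + inches
    return total_inches * CM_PER_INCH


def template_pixels(feet, inches=0.0, px_per_cm=1.0, oversample=2.0):
    """Pixel size of the artwork for one measurement.

    px_per_cm=1.0 matches the 1 pixel per centimeter templates;
    oversample=2.0 is the "double size, then scale down by half" idea.
    """
    return round(feet_inches_to_cm(feet, inches) * px_per_cm * oversample)


if __name__ == "__main__":
    # Hypothetical wall height of 8 ft 6 in.
    print(template_pixels(8, 6, oversample=1.0))  # size at 1 px per cm
    print(template_pixels(8, 6))                  # doubled for super-sampling
```

Rounding to whole pixels happens only at the very end, so the doubled artwork keeps as much of the measured proportion as possible before it gets scaled down in AE.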
Here's an example of the templates I created in order to make the 3D room: