Security Now 254
Episode 254
Topic: What We'll Do For Speed | Recorded: June 23, 2010 | Published: June 24, 2010 | Duration: 1:27:35
Contents
Security Now 254: What We'll Do For Speed
News & Errata
9:40 - 13:29
- The latest Apple Mac OS X update installed an OLDER vulnerable version of Flash on people's computers, even if they had already updated the Flash player
13:30 - 24:50
- Attorneys General for 30 states have expressed interest in investigating Google for collecting data from unsecured WiFi hotspots
- The French data protection agency (CNIL) has said that it found medical and banking information in the data collected by Google
Spinrite Story
24:51 - 25:35 John Lovell (UK)
Spinrite fixed a dead computer
What We'll Do For Speed
29:30 - 01:25:16
29:30 - 01:02:56
- On the PDP-8 you would fetch the instruction from main memory into the instruction register
- It would then look at the opcode to see what to do
- Then it would execute the instruction
- The engineers realised that whilst they were doing this main memory wasn't being used
- On a car assembly line you do a little bit at each stage of the assembly line
- Once the car assembly line is full of partly assembled cars, cars are finished faster
- This is called a pipeline in processor technology
- Every processor made in the last two decades is pipelined
- Most code is sequential
- To get more performance out of a system you look at the various components of the system and try to keep all the parts busy
- Whilst you are processing an instruction main memory isn't being used
- So whilst an instruction is being processed the next instruction is fetched from memory, so when the previous instruction moves past being processed the next one is there waiting
- And repeat
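The cycle-count benefit of this overlap can be sketched in a few lines of Python (a toy model with a hypothetical three-stage pipeline and one cycle per stage — not any real processor's timing):

```python
# Toy model: compare sequential execution against a 3-stage pipeline
# (fetch, decode, execute), assuming one cycle per stage.

def run_sequential(program):
    """Without overlap, each instruction occupies all 3 stages by itself."""
    return 3 * len(program)

def run_pipeline(program):
    """With overlap, one instruction finishes every cycle once the
    pipeline is full: 3 cycles to fill, then 1 per remaining instruction."""
    return 3 + len(program) - 1

program = ["LOAD", "ADD", "STORE", "JMP"]   # hypothetical instructions
print(run_sequential(program))  # 12 cycles, one instruction at a time
print(run_pipeline(program))    # 6 cycles, all stages kept busy
```

The gap widens with program length: the pipelined version approaches one instruction per cycle, which is the whole point of keeping every stage busy.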
- Instructions may interact with each other
- E.g. if you add two registers and the next instruction takes the result of the add and stores it
- This creates a problem as the instructions in the pipeline interact with each other
- Over time the pipelines have been made longer to increase performance
- So the instructions are broken into smaller pieces called 'micro ops'
- E.g. The instruction "push a register" can be broken down into these two steps:
- 1) Decrement stack pointer
- 2) Copy the register to where stack pointer is pointing
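The two-step decomposition above can be written out against a toy machine state (the dictionary layout and register name here are invented for illustration):

```python
# Sketch of a PUSH instruction broken into its two micro-ops,
# operating on a toy machine state. Not any real ISA.

def push(state, reg):
    """PUSH reg, as the two micro-ops described above."""
    state["sp"] -= 1                         # micro-op 1: decrement stack pointer
    state["mem"][state["sp"]] = state[reg]   # micro-op 2: copy register to where SP points
    return state

state = {"sp": 8, "mem": {}, "ax": 42}
push(state, "ax")
print(state["sp"], state["mem"][7])  # 7 42
```

Once an instruction is split this way, the individual micro-ops become the units the scheduler can reorder and overlap.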
- The processors of today can realise that, with otherwise idle resources, they can be doing other things for later instructions that have NOTHING to do with the current instruction
- E.g. if the processor is out fetching data from memory the arithmetic logic unit is idle, so if a later instruction involves an addition it can be moved forward in the queue and executed at that same time
- This is called out of order execution
- Processors went 'superscalar'
- This means they can execute more than one instruction per clock cycle
- Some processors have up to 4 adders so they can do 4 additions at the same time
- 'Retiring an instruction' means that you write the result of an instruction back out
- There are temporary scratch pad registers that are not visible to the outside world
- When you retire an instruction you write it out to programmer-accessible registers
- The engineers realised that sometimes they had a result which a later instruction was waiting for even though they hadn't retired it yet
- So logic was implemented to look in the pipeline for results which had not yet been retired but were needed for an operation in progress
- This is called 'Result Forwarding'
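A toy illustration of result forwarding: a dependent instruction reads a pending value from the pipeline's not-yet-retired results instead of waiting for the architectural register file to be updated. All register names and the dictionary layout are invented:

```python
# Toy model of result forwarding: in-flight (unretired) results take
# precedence over the architectural register file when reading operands.

arch_regs = {"r1": 5, "r2": 7, "r3": 0}   # programmer-visible registers
in_flight = {}                            # results computed but not yet retired

def read_operand(reg):
    # Forwarding: prefer a pending in-flight result over the register file.
    return in_flight.get(reg, arch_regs[reg])

# ADD r3, r1, r2 -- executed, but its result has not been retired yet
in_flight["r3"] = read_operand("r1") + read_operand("r2")

# A later instruction needing r3 can proceed immediately via forwarding.
print(read_operand("r3"))   # 12, even though arch_regs["r3"] is still 0

# Retirement: write the result out to the architectural register.
arch_regs["r3"] = in_flight.pop("r3")
```

Without the forwarding path, the dependent instruction would have to stall until retirement wrote r3 back.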
- With branching you don't know if you are going to keep going or go somewhere else until late in the instruction's progress through the pipeline
- If you do branch away then everything behind that instruction is scrapped and the entire pipeline is dumped
- You then stall until you can load a series of instructions from the new location
- Engineers realised that branch prediction is crucial
- This is to predict what a branch is going to do before it does it
- One technique is that if you have encountered that branch before, you assume the outcome will be the same as last time
- There are 1-bit tables recording the outcomes of previous branches
- Generally branches backwards are taken more often than branches forward
- It is also common for branches to do the opposite of what they did last time
- A 'Saturating Counter' scheme uses a table of 2-bit counters
- If you take a certain branch you increment its counter, but it can't go past 11 (binary)
- If you didn't take the branch you decrement it, down to a floor of 00
- This gives you a better way to predict the outcome of a branch
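The 2-bit saturating counter described above fits in a few lines of Python (a minimal sketch; the class name and the warm-up sequence are just for illustration):

```python
# A 2-bit saturating-counter branch predictor: increment on taken
# (capped at 3, i.e. binary 11), decrement on not-taken (floored at 0),
# and predict "taken" when the counter is in the upper half (2 or 3).

class SaturatingCounter:
    def __init__(self):
        self.state = 0  # 0..3, i.e. binary 00..11

    def predict(self):
        return self.state >= 2  # True means "predict taken"

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

c = SaturatingCounter()
for _ in range(3):
    c.update(True)          # warm up: branch taken repeatedly, counter saturates

preds = []
for outcome in [True, False, True, True]:
    preds.append(c.predict())
    c.update(outcome)
print(preds)  # [True, True, True, True]
```

Note the advantage over a 1-bit scheme: the single not-taken outcome in the middle of the stream nudges the counter from 3 to 2 but does not flip the prediction, so the predictor stays right on the next iteration.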
- There was still the problem of patterns of branching that this counter couldn't predict
- So shift registers recording recent branch history were connected to the 2-bit tables, enabling branch pattern recognition
- These branch predictors are 93% accurate
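The shift-register idea can be sketched as a tiny two-level predictor: the recent outcome history selects which 2-bit counter to consult, so a pattern a single counter cannot learn (e.g. strictly alternating taken/not-taken) becomes predictable. The 2-bit history length and all names here are illustrative:

```python
# Sketch of a two-level (history-based) branch predictor: a shift
# register of recent outcomes indexes a table of 2-bit saturating
# counters, one counter per history pattern.

HISTORY_BITS = 2
counters = [0] * (1 << HISTORY_BITS)   # one 2-bit counter per pattern
history = 0                            # shift register of recent outcomes

def predict():
    return counters[history] >= 2      # True means "predict taken"

def update(taken):
    global history
    c = counters[history]
    counters[history] = min(3, c + 1) if taken else max(0, c - 1)
    history = ((history << 1) | int(taken)) & ((1 << HISTORY_BITS) - 1)

# A strictly alternating branch defeats a lone counter but trains this
# scheme: after a brief warm-up it predicts every outcome correctly.
pattern = [True, False] * 10
hits = 0
for taken in pattern:
    if predict() == taken:
        hits += 1
    update(taken)
print(hits, "/", len(pattern))  # 17 / 20 -- only the first cycles miss
```

After the counters for the two recurring history patterns are trained, the predictor locks onto the alternation and never misses again.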
01:08:10 - 01:25:16
- When a return instruction is pulled into the top of the pipeline there is a problem
- Subroutines push register values onto the stack so that they can be popped off and restored later to prevent messing up the code that called the subroutine
- The problem is that a return instruction brings everything to a halt: you don't know where the stack pointer will be, because instructions still ahead of the return in the pipeline are changing its value
- There is a nesting that is almost always followed however
- So the execution unit in the processor maintains its own call stack
- When the processor sees a call to a subroutine it records internally the address of the instruction after that call, on its own private stack which ranges from 4 to 32 entries
- When a return instruction is seen the system pulls off its own internal stack to where it knows it will return to
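This private call stack (often called a return-address stack) is easy to sketch; the depth, addresses, and function names below are invented for illustration:

```python
# Sketch of the processor's private return-address stack: on a CALL,
# push the address of the instruction after it; on a RET, pop that
# address to predict the return target before the real stack pointer
# has been resolved.

MAX_DEPTH = 8    # real hardware keeps roughly 4-32 entries
ras = []

def on_call(call_addr, call_size=1):
    if len(ras) < MAX_DEPTH:            # a full stack simply stops recording
        ras.append(call_addr + call_size)  # instruction after the CALL

def on_return():
    # Predicted return target; None if the private stack has run dry.
    return ras.pop() if ras else None

on_call(100)        # CALL at address 100
on_call(200)        # nested CALL at address 200
print(on_return())  # 201 -- innermost return predicted first
print(on_return())  # 101
```

Because calls and returns almost always nest cleanly, popping this private stack predicts the return target correctly nearly every time, and the pipeline never has to stall waiting for the real stack pointer.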
- Even with all this technology there were still parts of the processor that sat idle a lot of the time
- So "Simultaneous Multithreading" (called Hyper-Threading by Intel) was born
- This is the recognition that there is register pressure: there is not enough freedom of value assignment amongst registers
- But if we had a whole second set of registers, then where some micro-instructions were fighting with each other due to interdependencies
- We could have a second hardware thread using another set of registers that are logically disconnected, so no conflicts are possible
- So hyperthreading literally pours instructions from two different threads of execution into the same pipeline
- RISC architecture is different in a number of ways
- It was designed to avoid the need for this kind of technology
- RISC instruction sets offer 'conditional instructions' and 'explicit condition code update'
- You have a stream of instructions that are being executed and a branch instruction skips over some
- If you widen the instruction word you can make conditional instructions
- These allow you to add additional bits to any instruction that say: execute this unless the condition code is 0
- You can have a group of instructions you may or may not want to execute
- So they added the ability for the instruction not to modify the condition code
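The effect of conditional (predicated) instructions can be modelled with a toy interpreter: every instruction carries a condition, and is simply squashed rather than branched over when the condition is false. The tiny "ISA", condition names, and register names here are all invented:

```python
# Sketch of predicated execution: each instruction carries a condition
# and only takes effect when that condition holds, so short if/else
# bodies need no branch instructions at all.

def execute(instrs, flags):
    regs = {"r0": 0, "r1": 0}
    for cond, dest, value in instrs:
        # "AL" = always execute; otherwise check the named condition flag.
        if cond == "AL" or flags.get(cond, False):
            regs[dest] = value      # condition holds: instruction executes
        # condition false: instruction is squashed, no branch needed
    return regs

# Equivalent of: if (zero) r0 = 1; else r1 = 2;  -- with no branches.
program = [
    ("Z",  "r0", 1),    # execute only if the Z (zero) flag is set
    ("NZ", "r1", 2),    # execute only if Z is clear
]
print(execute(program, {"Z": True,  "NZ": False}))  # {'r0': 1, 'r1': 0}
print(execute(program, {"Z": False, "NZ": True}))   # {'r0': 0, 'r1': 2}
```

Since there is no branch, there is nothing for the branch predictor to mispredict and no pipeline flush on the wrong guess — which is why letting some instructions skip the condition-code update (so a group of predicated instructions can share one test) matters.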
Sponsors
Carbonite
- Carbonite.com
- Offer Code: securitynow
- Carb #3
- Ad Times: 00:45-00:59 and 04:45-09:08
Go To Meeting
- GoToMeeting.com/SecurityNow
- GoToMeeting #8
- Ad Times: 01:02 - 01:14 and 26:30 - 29:28
Squarespace
- Squarespace.com/securitynow
- Ad Times: 01:18-01:27 and 1:03:15 - 1:06:41
Production Information
- Edited by: djc
- Notes: