Security Now 254

From The Official TWiT Wiki

Security Now 254: What We'll Do For Speed

News & Errata

9:40 - 13:29

  • The latest Apple Mac OS X update installed an older, vulnerable version of Flash on people's computers, even if they had already updated the Flash player

13:30 - 24:50

  • Attorneys general for 30 states have expressed interest in investigating Google for collecting data from unsecured WiFi hotspots
  • The French data protection agency (CNIL) has said that it has found medical and banking information in the data collected by Google

SpinRite Story

24:51 - 25:35 John Lovell (UK)

SpinRite fixed a dead computer

What We'll Do For Speed

29:30 - 01:25:16

29:30 - 01:02:56

  • On the PDP-8, the processor would fetch the instruction from main memory into the instruction register
  • It would then look at the opcode to see what to do
  • Then it would execute the instruction (a code sketch of this fetch/decode/execute loop follows this list)
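
To make that fetch/decode/execute cycle concrete, here is a minimal sketch in C of a toy accumulator machine. The opcodes, the opcode*100+address packing, and the memory layout are invented for illustration; this is not real PDP-8 encoding.

 #include <stdio.h>
 
 /* Toy accumulator machine: invented opcodes, NOT real PDP-8 encoding. */
 enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };
 
 int main(void) {
     /* Each "word" packs an opcode and an address as opcode*100 + addr. */
     int memory[16] = {
         1 * 100 + 10,   /* LOAD  mem[10] */
         2 * 100 + 11,   /* ADD   mem[11] */
         3 * 100 + 12,   /* STORE mem[12] */
         0,              /* HALT          */
         [10] = 5, [11] = 7, [12] = 0
     };
     int pc = 0, acc = 0;
 
     for (;;) {
         int ir = memory[pc++];    /* 1) fetch into the instruction register */
         int opcode = ir / 100;    /* 2) decode: look at the opcode          */
         int addr   = ir % 100;
         switch (opcode) {         /* 3) execute                             */
             case OP_LOAD:  acc = memory[addr];   break;
             case OP_ADD:   acc += memory[addr];  break;
             case OP_STORE: memory[addr] = acc;   break;
             case OP_HALT:  printf("mem[12] = %d\n", memory[12]); return 0;
         }
     }
 }

Running it prints "mem[12] = 12": the toy machine fetches, decodes, and executes one instruction at a time, exactly as described above.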


  • The engineers realised that whilst all of this was going on, main memory wasn't being used
  • On a car assembly line you do a little bit of the work at each stage of the line
  • Once the assembly line is full of partly assembled cars, finished cars come off the end faster
  • This is called a pipeline in processor technology
  • Every processor made in the last two decades is pipelined


  • Most code is sequential
  • To get more performance out of a system you look at the various components of the system and try to keep all the parts busy


  • Whilst an instruction is being processed, main memory isn't being used
  • So, whilst one instruction is being processed, the next instruction is fetched from memory, so that when the previous instruction moves on through the pipeline the next one is already there waiting
  • And repeat (see the pipeline sketch after this list)
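
Here is a rough sketch in C of that overlap: it prints, cycle by cycle, which instruction occupies each stage of a pretend 3-stage fetch/decode/execute pipeline. The stage split and the instruction names (I1..I5) are invented for illustration.

 #include <stdio.h>
 
 /* Rough sketch of a 3-stage pipeline (fetch/decode/execute) filling up.
  * The program and the stage split are invented for illustration only. */
 int main(void) {
     const char *program[] = { "I1", "I2", "I3", "I4", "I5" };
     const int n = 5, stages = 3;
 
     printf("cycle  fetch   decode  execute\n");
     for (int cycle = 0; cycle < n + stages - 1; cycle++) {
         /* In cycle c, instruction (c - s) occupies stage s, if it exists. */
         printf("%5d", cycle + 1);
         for (int s = 0; s < stages; s++) {
             int i = cycle - s;
             printf("  %-6s", (i >= 0 && i < n) ? program[i] : "-");
         }
         printf("\n");
     }
     return 0;
 }

Once the pipe is full (from cycle 3 onward) one instruction completes every cycle, even though each individual instruction still spends three cycles in flight; that is the assembly-line effect described above.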


  • Instructions may interact with each other
  • E.g. one instruction adds two registers and the next instruction takes the result of the add and stores it
  • This creates a problem, as the instructions in the pipeline interact with each other
  • Over time, pipelines have been made longer to increase performance
  • So the instructions are broken into smaller pieces called 'micro-ops'
  • E.g. the instruction "push a register" can be broken down into these two steps (sketched in code after this list):
  • 1) Decrement the stack pointer
  • 2) Copy the register to where the stack pointer is pointing
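
A minimal sketch in C of cracking the "push a register" instruction into those two micro-ops. The Machine structure, the register count, and the stack layout are invented for illustration.

 #include <stdio.h>
 
 /* Invented micro-op model: a "push" macro-instruction is cracked into two
  * simpler steps, mirroring the example in the notes above. */
 #define STACK_SIZE 16
 
 typedef struct {
     int regs[4];              /* architectural registers r0..r3   */
     int sp;                   /* stack pointer (index into stack) */
     int stack[STACK_SIZE];
 } Machine;
 
 /* micro-op 1: decrement the stack pointer */
 static void uop_dec_sp(Machine *m) { m->sp--; }
 
 /* micro-op 2: copy the register to where the stack pointer is pointing */
 static void uop_store_reg(Machine *m, int reg) { m->stack[m->sp] = m->regs[reg]; }
 
 /* The macro-instruction "push rN" is just the two micro-ops in sequence. */
 static void push_reg(Machine *m, int reg) {
     uop_dec_sp(m);
     uop_store_reg(m, reg);
 }
 
 int main(void) {
     Machine m = { .regs = { 11, 22, 33, 44 }, .sp = STACK_SIZE };
     push_reg(&m, 2);
     printf("sp=%d, top of stack=%d\n", m.sp, m.stack[m.sp]);   /* sp=15, 33 */
     return 0;
 }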


  • Today's processors can recognise that, with otherwise idle resources, they can be doing other work for later instructions that have NOTHING to do with the current instruction
  • E.g. if the processor is out fetching data, the arithmetic logic unit is idle, so if a later instruction involves an addition it can be moved to the front of the queue and executed at the same time
  • This is called out-of-order execution
  • Processors also went 'superscalar'
  • This means a processor can execute more than one instruction per cycle
  • Some processors have up to 4 adders so they can do 4 additions at the same time (a toy scheduling sketch follows this list)
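
A toy sketch in C of out-of-order, superscalar issue: each cycle, any not-yet-issued instruction whose source values are ready gets issued, so an independent add goes ahead while a dependent add waits on a slow load. The instruction list, register numbers, and latencies are invented for illustration.

 #include <stdio.h>
 #include <stdbool.h>
 
 /* Toy out-of-order issue model: issue anything whose sources are ready,
  * instead of waiting behind earlier, unrelated instructions. */
 typedef struct {
     const char *text;
     int dest, src1, src2;   /* register numbers             */
     int latency;            /* cycles until result is ready */
     bool issued;
 } Instr;
 
 int main(void) {
     int ready_at[8] = {0};  /* cycle at which each register's value is ready */
     Instr prog[] = {
         { "r4 = load [r0]", 4, 0, 0, 3, false },  /* slow memory load          */
         { "r5 = r4 + r1",   5, 4, 1, 1, false },  /* depends on the load       */
         { "r6 = r2 + r3",   6, 2, 3, 1, false },  /* independent: issues early */
     };
     int n = 3, remaining = n;
 
     for (int cycle = 1; remaining > 0; cycle++) {
         for (int i = 0; i < n; i++) {
             if (!prog[i].issued &&
                 ready_at[prog[i].src1] <= cycle && ready_at[prog[i].src2] <= cycle) {
                 printf("cycle %d: issue %s\n", cycle, prog[i].text);
                 prog[i].issued = true;
                 ready_at[prog[i].dest] = cycle + prog[i].latency;
                 remaining--;
             }
         }
     }
     return 0;
 }

It prints that the load and the independent add both issue in cycle 1 (two instructions in one cycle, i.e. superscalar), while the dependent add has to wait until cycle 4.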


  • 'Retiring an instruction' means that you write the result of the instruction back out
  • There are temporary scratchpad registers that are not visible to the outside world
  • When you retire an instruction you write its result out to the programmer-accessible registers
  • The engineers realised that sometimes they had a result which a later instruction was waiting for, even though they hadn't retired it yet
  • So logic was implemented to look into the pipeline for results which had not yet been retired but were needed by an operation in progress
  • This is called 'result forwarding' (sketched in code after this list)
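
A minimal sketch in C of result forwarding: a value that has been computed but not yet retired sits in a pipeline latch, and a dependent instruction reads it from there instead of the stale architectural register. The latch structure and register numbers are invented for illustration.

 #include <stdio.h>
 #include <stdbool.h>
 
 /* A computed-but-not-yet-retired result sits in a pipeline latch; a
  * dependent instruction reads it from there rather than stalling until
  * the result is written back to the register file. */
 typedef struct {
     int  dest;     /* architectural register the result is headed for */
     int  value;    /* the value already computed in the execute stage */
     bool valid;
 } PipelineLatch;
 
 static int read_operand(int reg, const int *regfile, const PipelineLatch *latch) {
     if (latch->valid && latch->dest == reg)
         return latch->value;   /* forward straight from the pipeline: no stall */
     return regfile[reg];       /* otherwise read the retired value             */
 }
 
 int main(void) {
     int regfile[8] = {0};
     regfile[1] = 10;
     regfile[2] = 20;
 
     /* "r3 = r1 + r2" has executed but not retired: the register file still
      * says r3 = 0, but the result (30) is waiting in the latch. */
     PipelineLatch latch = { .dest = 3, .value = 30, .valid = true };
 
     /* Next instruction "r4 = r3 + 5" gets 30 via forwarding, not the stale 0. */
     int r4 = read_operand(3, regfile, &latch) + 5;
     printf("r4 = %d\n", r4);   /* prints 35 */
     return 0;
 }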


  • With branching, you don't know whether you are going to keep going or go somewhere else until late in the instruction's progress through the pipeline
  • If you do branch away, then everything behind that instruction is scrapped and the entire pipeline is dumped
  • You then stall until you can load a series of instructions from the new location (see the flush sketch after this list)
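
A rough sketch in C of that flush, with an invented 4-deep pipeline and made-up addresses: when the oldest instruction turns out to be a taken branch, everything fetched behind it is squashed into bubbles and fetching restarts at the branch target.

 #include <stdio.h>
 
 /* Invented model: the oldest instruction in a 4-deep pipe is a taken
  * branch, so the younger entries are scrapped and fetch restarts at
  * the target, costing several cycles of no useful work. */
 #define DEPTH 4
 
 int main(void) {
     int pc_in_stage[DEPTH] = { 104, 103, 102, 101 };  /* 101 is the branch */
     int branch_target = 200;
 
     printf("branch at %d is taken -> squash:", pc_in_stage[DEPTH - 1]);
     for (int s = 0; s < DEPTH - 1; s++) {
         printf(" %d", pc_in_stage[s]);
         pc_in_stage[s] = -1;             /* -1 marks a bubble (no instruction) */
     }
     printf("\nrestart fetch at %d; %d cycles pass before results flow again\n",
            branch_target, DEPTH - 1);
     return 0;
 }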


  • Engineers realised that branch prediction is crucial
  • This means predicting what a branch is going to do before it does it
  • One technique: if you have encountered that branch before, then you assume the outcome will be the same as last time
  • There are 1-bit tables recording the outcomes of previous branches
  • Generally, backwards branches are taken more often than forward branches
  • It is also common for branches to do the opposite of what they did last time
  • A 'saturating counter' uses a 2-bit table entry
  • If you take a certain branch you increment the counter, but it can't go past 11 (binary)
  • If you didn't take the branch you decrement the counter, but not below 00
  • This gives you a better way to predict the outcome of a branch (a counter sketch follows this list)
  • There was still the problem of patterns of branching that this counter couldn't predict
  • There are shift registers connected to 2-bit tables that enable branch pattern recognition
  • These branch predictors are 93% accurate
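
A minimal sketch in C of the 2-bit saturating counter described above: values 00 and 01 predict "not taken", 10 and 11 predict "taken", and the counter saturates at both ends. The example branch history (a loop branch that falls through once at loop exit) is invented for illustration.

 #include <stdio.h>
 #include <stdbool.h>
 
 /* One 2-bit saturating counter (one table entry): 00/01 = predict not
  * taken, 10/11 = predict taken; it can't go below 00 or above 11. */
 typedef struct { unsigned counter; } Predictor;
 
 static bool predict(const Predictor *p) { return p->counter >= 2; }
 
 static void update(Predictor *p, bool taken) {
     if (taken  && p->counter < 3) p->counter++;   /* saturate at 11 */
     if (!taken && p->counter > 0) p->counter--;   /* saturate at 00 */
 }
 
 int main(void) {
     Predictor p = { .counter = 0 };
     /* A loop branch: taken many times, not taken once at the loop exit. */
     bool history[] = { true, true, true, true, true, false, true, true };
     int total = (int)(sizeof history / sizeof history[0]);
     int correct = 0;
 
     for (int i = 0; i < total; i++) {
         bool guess = predict(&p);
         if (guess == history[i]) correct++;
         update(&p, history[i]);
     }
     printf("%d/%d predictions correct\n", correct, total);
     return 0;
 }

Note that the single not-taken outcome at the loop exit only drops the counter from 11 to 10, so the predictor still says "taken" when the loop starts again; that is the improvement over a 1-bit scheme, which would flip and mispredict twice.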


01:08:10 - 01:25:16

  • When a return instruction is pulled into the top of the pipeline there is a problem
  • Subroutines push register values onto the stack so that they can be popped off and restored later, to avoid messing up the code that called the subroutine
  • The problem is that a return instruction brings everything to a halt, because you don't know where the stack pointer will be: the instructions ahead of the return are still changing its value
  • There is a nesting that is almost always followed, however
  • So the execution unit in the processor maintains its own call stack
  • When it sees a call to a subroutine, it records internally the address of the instruction after that call on its own private stack, which has 4 to 32 entries
  • When a return instruction is seen, the processor pops the predicted return address off its own internal stack (sketched in code after this list)
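
A sketch in C of that private return-address stack: on a call, push the address of the instruction after the call; on a return, pop it and use it as the predicted target. The depth, addresses, and instruction sizes are invented for illustration.

 #include <stdio.h>
 
 /* Invented return-address stack predictor: remember the address after each
  * call, and on a return pop it as the predicted target without waiting for
  * the real stack pointer to settle. */
 #define RAS_DEPTH 8   /* the notes above describe real depths of 4 to 32 */
 
 typedef struct {
     int entries[RAS_DEPTH];
     int top;
 } ReturnStack;
 
 static void on_call(ReturnStack *ras, int call_addr, int call_size) {
     if (ras->top < RAS_DEPTH)
         ras->entries[ras->top++] = call_addr + call_size;  /* address after call */
 }
 
 static int on_return(ReturnStack *ras) {
     return (ras->top > 0) ? ras->entries[--ras->top] : -1; /* -1: no prediction */
 }
 
 int main(void) {
     ReturnStack ras = { .top = 0 };
     on_call(&ras, 100, 4);    /* call at address 100, 4 bytes long  */
     on_call(&ras, 204, 4);    /* nested call inside that subroutine */
     printf("predicted return: %d\n", on_return(&ras));  /* 208 */
     printf("predicted return: %d\n", on_return(&ras));  /* 104 */
     return 0;
 }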


  • Even with all this technology there were still parts of the processor that sat idle a lot of the time
  • So simultaneous multithreading (which Intel calls Hyper-Threading) was born
  • This is the recognition that there is register pressure: there is not enough freedom of value assignment amongst the registers
  • But if we had a whole second set of registers, then where some micro-ops would otherwise fight with each other due to interdependencies, a second hardware thread could use the other set of registers, which is logically disconnected, so no conflicts are possible
  • So hyperthreading literally pours instructions from two different threads of execution into the same pipeline (sketched in code after this list)
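
A sketch in C of the idea behind simultaneous multithreading: two hardware threads each get their own register file, but their instructions share the one execution pipeline. Because the register sets are separate, instructions from the two threads can never conflict over registers. The structures and register values are invented for illustration.

 #include <stdio.h>
 
 /* Two hardware threads, one shared execution resource: each thread keeps
  * its own architectural register file, so there are no register conflicts
  * between them and the pipeline has more independent work to choose from. */
 typedef struct { int regs[8]; } RegisterFile;
 
 typedef struct {
     RegisterFile rf[2];      /* one register file per hardware thread */
 } Core;
 
 /* One shared "execute" unit; which thread's registers it touches is
  * selected per instruction, so the two threads can interleave freely. */
 static void execute_add(Core *c, int thread, int dest, int src1, int src2) {
     RegisterFile *rf = &c->rf[thread];
     rf->regs[dest] = rf->regs[src1] + rf->regs[src2];
     printf("thread %d: r%d = r%d + r%d = %d\n",
            thread, dest, src1, src2, rf->regs[dest]);
 }
 
 int main(void) {
     Core core = { .rf = { { .regs = { 0, 1, 2 } }, { .regs = { 0, 10, 20 } } } };
 
     /* Both threads run "r3 = r1 + r2", yet there is no conflict, because
      * each thread's r1, r2 and r3 live in its own register file. */
     execute_add(&core, 0, 3, 1, 2);   /* thread 0: 1 + 2   = 3  */
     execute_add(&core, 1, 3, 1, 2);   /* thread 1: 10 + 20 = 30 */
     return 0;
 }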


  • RISC architecture is different in a number of ways
  • It was designed in a way that avoids the need for some of this kind of machinery
  • RISC instructions offer conditional execution and explicit condition code update
  • Normally you have a stream of instructions being executed, and a branch instruction skips over some of them
  • If you widen the instruction word you can make instructions conditional
  • This adds extra bits to any instruction that say, in effect, execute this unless the condition code is 0


  • You can have a group of instructions that you may or may not want to execute
  • So they added the ability for an instruction not to modify the condition code (see the predication sketch after this list)
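
A sketch in C of the effect of conditional (predicated) instructions with explicit condition code update: every operation carries a condition and simply does nothing when the condition is false, and it only touches the condition code when asked to, so a whole group of instructions can hang off one earlier comparison without any branch. The helper functions and the single "zero" flag are invented to mimic the RISC style described above; this is not real instruction encoding.

 #include <stdio.h>
 #include <stdbool.h>
 
 /* A one-bit "condition code" for the sketch. */
 typedef struct { bool zero; } Flags;
 
 /* Predicated add: executes only if 'cond' is true; updates the condition
  * code only if 'set_flags' is true (the "explicit condition code update"). */
 static void add_if(bool cond, int *dest, int a, int b, bool set_flags, Flags *f) {
     if (!cond) return;                 /* condition false: the instruction is a no-op */
     *dest = a + b;
     if (set_flags) f->zero = (*dest == 0);
 }
 
 int main(void) {
     Flags f = { .zero = false };
     int x = 0, y = 0, scratch = 0;
 
     /* A comparison: compute 3 + (-3) and set the flags (here just "zero"). */
     add_if(true, &scratch, 3, -3, true, &f);
 
     /* A group of instructions all predicated on that one comparison; none
      * of them touch the flags, so the condition survives across the group. */
     add_if(f.zero, &x, 10, 5, false, &f);
     add_if(f.zero, &y, x, 1, false, &f);
 
     printf("x=%d y=%d\n", x, y);   /* x=15 y=16: both ran, and no branch was needed */
     return 0;
 }

Because nothing is ever skipped over by a branch, the pipeline never has to guess an outcome and never has to be flushed for these short conditional sequences.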


Sponsors

Carbonite

  • Carbonite.com
  • Offer Code: securitynow
  • Carb #3
  • Ad Times: 00:45-00:59 and 04:45-09:08

Go To Meeting

Squarespace

Production Information

  • Edited by: djc
  • Notes:
This area is for use by TWiT staff only. Please do not add or edit any content within this section.