
koitsu
Premium,MVM
join:2002-07-16
Mountain View, CA
kudos:23

Hmm...

Possible root causes that I have seen within companies (corporations) and in the minds of some engineers:

Someone's calculations being wrong when polling data from network devices. Most network devices count the number of octets sent/received across an interface. 1 octet = 1 byte, and 1000 bytes = 1 kilobyte (keep reading).

When it comes to network traffic, you're supposed to calculate rates from the raw octet counters using a fairly simple formula, which a lot of people don't do. Instead, they try to do things like work off numbers that have already been converted to kilobytes (which means you've lost granularity). Note in the reference material how all of the calculation methods involve multiplying by 8 -- as I said above, 8 bits to a byte.
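Here's a minimal sketch (Python) of that calculation, assuming SNMP-style polling of a 32-bit octet counter; the counter values and polling interval below are made up purely for illustration:

COUNTER_MAX = 2**32  # classic 32-bit ifInOctets; use 2**64 for the HC counters

def bits_per_second(prev_octets, curr_octets, interval_seconds):
    """Rate over one polling interval -- work in bits, convert units at the very end."""
    delta = curr_octets - prev_octets
    if delta < 0:                # the counter wrapped between polls
        delta += COUNTER_MAX
    return delta * 8 / interval_seconds   # octets -> bits, then per second

# two polls taken 300 seconds apart, with a wrap in between:
print(bits_per_second(4_294_000_000, 12_000_000, 300))   # ~345,794.56 bits/sec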

This brings us next to the whole kilobyte vs. kibibyte thing. God I hate this. It didn't use to be this nonsensical. As a computer programmer, a kilobyte, to me, has always been equal to 1024 bytes (2^10). That's how computers calculate data on a bit level: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. xkcd took a jab at this too.

And I say all this knowing quite well that the prefix "kilo-" has always meant 1000 -- it's just that the term, when used in computing, has always referred to powers of 2. Anyway, purists complained/bitched that "kilobyte" should actually refer to 1000 bytes, not 1024 bytes like it had up until that point. Hard disk manufacturers I'm certain had a role in this too, since they began claiming (on packaging, manuals, etc.) that 1000 bytes = 1 kilobyte, which isn't how the hardware itself works -- this was done purely from a marketing perspective, to make hard disks look like they have more capacity than they really do (2.4% more per prefix step, closer to 7% by the time you're at gigabytes). As such, the prefix "kibi-" was created to refer to 1024.

And let's not forget this annoyance too, which is common for folks unfamiliar with network devices or telecommunications (speaking in general terms here).

Anyway, the question then becomes: when converting into a unit like kilobytes, do you divide by 1024 or by 1000? When it comes to network devices, you're supposed to use 1000, and you're always supposed to measure things in bits. When I say "measure things in bits", I mean that all calculations should be done in bits, saving the large-unit conversion for the very end.

The difference, when given large amounts of data, can be pretty substantial. Here are some real examples (a short sketch that reproduces these conversions follows the list). Note that I round to the nearest whole number: up if the fraction is >= 0.5, down otherwise (duh).

193859387214 bits = 24232423402 bytes

Now let's apply the stupid kibi vs. kilo ordeal:

193859387214 bits = 193,859,387 kilobits (1000)
193859387214 bits = 189,315,808 kibibits (1024)

193859387214 bits = 24,232,423 kilobytes (1000)
193859387214 bits = 23,664,476 kibibytes (1024)

193859387214 bits = 193,859 megabits (1000)
193859387214 bits = 184,879 mebibits (1024)

193859387214 bits = 24,232 megabytes (1000)
193859387214 bits = 23,110 mebibytes (1024)

193859387214 bits = 194 gigabits (1000)
193859387214 bits = 181 gibibits (1024)

193859387214 bits = 24 gigabytes (1000)
193859387214 bits = 23 gibibytes (1024)
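For anyone who wants to check the arithmetic, here's a quick sketch (Python) that reproduces the list above -- the same bit count expressed with decimal (1000-based) and binary (1024-based) prefixes:

bits = 193_859_387_214

for name, power in [("kilo/kibi", 1), ("mega/mebi", 2), ("giga/gibi", 3)]:
    decimal = bits / 1000**power     # kilo-, mega-, giga-
    binary = bits / 1024**power      # kibi-, mebi-, gibi-
    print(f"{name}bits:  {round(decimal):,} vs {round(binary):,}")
    print(f"{name}bytes: {round(decimal / 8):,} vs {round(binary / 8):,}")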

Technically the difference between the two (1000 vs. 1024) is only 2.4% per prefix step (about 7.4% at the giga level), and the reporter says he's seen differing amounts of up to 20-30%, so maybe I'm barking up the wrong tree.

Maybe someone is doing something stupid like trying to calculate a volumetric total from bits-per-second samples, which is incorrect -- the rate samples only give you an average, while the raw counter deltas give you an aggregate total. If they're doing this, shame on them. I have seen people do this before, and I have also seen open-source projects screw this up, so it's not limited to just big corporations.
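To make that concrete, here's a contrived sketch (Python, with made-up sample points) of why summing counter deltas is the right way to get a volume, while multiplying an averaged rate back out by elapsed time is not, especially once samples are unevenly spaced or have been consolidated:

# (timestamp_seconds, octet_counter) samples from a hypothetical interface;
# note the uneven polling intervals -- that's where the averaging trap bites.
samples = [(0, 0), (30, 30_000), (90, 30_600), (100, 130_600)]

# Correct: the volume is the sum of the counter deltas (here, last minus first),
# converted to bits once at the end.
total_bits = (samples[-1][1] - samples[0][1]) * 8            # 1,044,800 bits

# Lossy: compute per-interval rates, average them without weighting by interval
# length (a common mistake), then multiply back by wall-clock time.
rates = [(b2 - b1) * 8 / (t2 - t1)
         for (t1, b1), (t2, b2) in zip(samples, samples[1:])]
average_rate = sum(rates) / len(rates)
reconstructed_bits = average_rate * (samples[-1][0] - samples[0][0])  # 2,936,000 bits

print(total_bits, int(reconstructed_bits))   # nowhere near each other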

And finally, god forbid if they're using something like RRDTool to store the acquired data, in which case this would actually work in the customer's favour, since RRDTool averages all the data (every row in its database) every time a new row/data point is inserted. (Yes, there are ways to turn this off -- use LAST instead of AVERAGE -- but even that has had some bugs in the past, if I remember correctly.)

This whole thing reminds me of the Verizon billing fiasco, where morons (even managers) couldn't understand the difference between 0.002 dollars and 0.002 cents.

There's also the possibility that the device AT&T is getting its statistics from is measuring traffic post-encapsulation. I don't know if ATM is used or what, but that tends to add quite a bit around every single frame (think packet, just to keep it simple), so if they're not subtracting that from the usage, again, shame on them. Otherwise they need to increase their permitted monthly totals by the encapsulation percentage delta to make up for it.
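For a sense of how big that encapsulation delta can be, here's a back-of-the-envelope sketch (Python) of the classic ATM/AAL5 "cell tax". The 8-byte AAL5 trailer and the 53-byte cells with 48-byte payloads are standard ATM figures; a real DSL circuit would also add PPPoE/LLC header bytes on top, which aren't counted here:

import math

def atm_wire_bytes(packet_bytes):
    """Bytes actually sent on the wire for one AAL5-encapsulated packet."""
    aal5 = packet_bytes + 8            # AAL5 trailer
    cells = math.ceil(aal5 / 48)       # padded out to whole 48-byte cell payloads
    return cells * 53                  # each 53-byte cell carries 5 bytes of header

for size in (64, 576, 1500):
    wire = atm_wire_bytes(size)
    print(f"{size}-byte packet -> {wire} bytes on the wire ({wire / size - 1:.0%} overhead)")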

I say all this as someone who partakes in the Tomato/TomatoUSB project, and should probably go look at the back-end scripts and Javascript used to calculate the aggregate total of network traffic per month... Let's face it: the problem could be there. I'm trying very hard not to apply Occam's razor to this...
--
Making life hard for others since 1977.
I speak for myself and not my employer/affiliates of my employer.

Kearnstd
Space Elf
Premium
join:2002-01-22
Mullica Hill, NJ
kudos:1
I do have to wonder: why did computer science go with divisions of 8?

As in, why was a byte not engineered as 10 bits?

Figuring at some point in the design of electronic computers, somebody had to decide to use groups of 8 bits instead of 10, which would have allowed computer data to fall in line with the metric system.
--
[65 Arcanist]Filan(High Elf) Zone: Broadband Reports


TomS_
Git-r-done
Premium,MVM
join:2002-07-19
London, UK
kudos:5
reply to koitsu
said by koitsu:

1 octet = 1 kilobyte

1 octet = 8 bits.


koitsu
Premium,MVM
join:2002-07-16
Mountain View, CA
kudos:23
said by TomS_:

said by koitsu:

1 octet = 1 kilobyte

1 octet = 8 bits.

Yep sorry, typo on my part. Too many units + editing jobs going on at once. I'll fix. Thank you.
--
Making life hard for others since 1977.
I speak for myself and not my employer/affiliates of my employer.


TelecomEng

@rr.com
reply to Kearnstd
said by Kearnstd:

I do have to wonder, why did computer science go with divisions of 8?

as in why was a byte not engineered as 10bits.

Because of the binary nature of computer equipment, everything is based on whole numbers of bits (binary digits), i.e. powers of two. Ten values don't fall on a whole bit-length: 3 bits give you 8 values (0 through 7), while 4 bits give you 16 (0 through 15), so a decimal digit has to round up to 4 bits with some waste.
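A one-line way to see it (Python): ceil(log2(n)) is the number of bits needed to distinguish n values, and 10 values already force you up to 4 bits:

import math

for values in (8, 10, 16, 256, 1024):
    print(values, "values need", math.ceil(math.log2(values)), "bits")
# 8 -> 3, 10 -> 4, 16 -> 4, 256 -> 8, 1024 -> 10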

rradina

join:2000-08-08
Chesterfield, MO
reply to Kearnstd
Early engineers needed to represent the alphabet (upper/lower case), 10 digits, various common symbols (+-/"%$...) and control characters. This required 7 bits, and it was the birth of the ASCII character set. An 8th bit was added for parity error detection. Anything more than this was wasteful, and in those early days core memory was ridiculously expensive and in very short supply.

Binary coded decimal (BCD) also requires multiples of four bits. Even though four bits can hold 16 values, only 10 of the 16 possible values are needed to represent a digit in BCD. However, since 3 bits can only represent 8 unique values, four bits -- with a bit of waste -- are necessary. (This is also called packed decimal.)
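A tiny illustration of the packed-decimal idea (Python; the helper name is just for this example): each decimal digit takes one 4-bit nibble, so two digits fit per byte and 6 of the 16 possible nibble values go unused.

def pack_bcd(number):
    digits = str(number)
    if len(digits) % 2:                    # pad to an even number of digits
        digits = "0" + digits
    return bytes(int(digits[i]) << 4 | int(digits[i + 1])
                 for i in range(0, len(digits), 2))

print(pack_bcd(1977).hex())   # '1977' -- each hex nibble is one decimal digit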


koitsu
Premium,MVM
join:2002-07-16
Mountain View, CA
kudos:23
reply to TelecomEng
I took Kearnstd's question to mean: why didn't we go with bit lengths of a different size rather than 8? Meaning, what's special about the value of 8? Why must a byte range from 0-255 rather than, say, 0-63 (6-bit), 0-1023 (10-bit), or even something strange like 0-8191 (13-bit)?

There are architectures (old and present-day) which define a byte as something other than 8 bits. The examples I've seen cited are the Intel 4004 (1 byte = 4 bits), the PDP-8 (1 byte = 12 bits), the PDP-10 (byte length in bits was variable, from 1 to 36), and present-day DSPs (which often just use the term "word", where a single word can be something like 60 bits).
--
Making life hard for others since 1977.
I speak for myself and not my employer/affiliates of my employer.

rradina

join:2000-08-08
Chesterfield, MO
reply to TelecomEng
I was about ready to answer that way too but then I read his posts again and thought about it more. Why couldn't a character have originally been defined as 10 bits? Perhaps it's because 10-bit boundaries would have been really wacky and inefficient in terms of an address controller?

gkloepfer
Premium
join:2012-07-21
Austin, TX
reply to Kearnstd
The most likely reason for 8-bit bytes was that a decimal digit could be represented by 4 bits (a "nybble" or "nibble", as it was called). It probably made sense to increase the data bus size in increments of 4 bits. Early processors even had special instructions for decimal arithmetic, so as to avoid having to convert an 8-bit binary number into up to 3 groups of 4 bits (which were generally displayed via a hardware decoder on a 7-segment display).

The first large minicomputer I used (PDP-10/PDP-20) had a 36-bit "word" (data size), which likewise gives heartburn to those who write emulators on modern hardware that is geared more toward 32 bits and multiples thereof.

In any case, increasing the widths by powers of 2 has some special advantages at the machine level over other sizes, which is the most likely reason they chose 8 over 10 bits as the size of a byte.