On Killing WIN32?
Many years ago I used to be a dedicated reader of Ars, but it slowly transitioned to something a little too biased for my taste, so I avoid it. Thanks to Twitter, though, it is possible to get sucked into...
Vulkan - How to Deal With the Layouts of Presentable Images
Continuing my posts on building a Vulkan-based, compute-based graphics engine from scratch (no headers, no libraries, no debug tools, no using existing working code)... Interesting to Google something...
Blink Mechanic for Fast View Switch for VR
As seen in the SIGGRAPH Realtime section for Bound for PS4 PSVR around 42 minutes in this video. Great to see someone making good use of the "blink mechanic" to quickly switch view in VR. Scene quickly...
Vulkan From Scratch Part 2
Continuing posting when I find some time to work on the "from-scratch" Vulkan engine... Review From Last Time: Bringing up on Windows first this time, will get to Linux later. Got basic system interface...
Uber Shader Unrolling
Looking at running a compute only pass, no graphics waves to contend with on the machine, so it becomes relatively easy to think about occupancy. Target 4 waves per SIMD via a 4 wave work-group (one...
GPU Parking Lot
Push Model: Perhaps the GPU parking lot, aka register file waiting on long latency returns, is a side effect of not having the ability to issue a load which pushes data to a different SIMD unit's register...
Transistor Count Thoughts
Wikipedia's Transistor Count Page: Really interesting page on Wikipedia. Amazing how many of the original cache-free Acorn RISC Machines will fit in the transistor budget of modern processors. The rest...
Thinking "Clearly" About 4K
The real advantage of this console "upgrade cycle" is that now a developer should be able to produce a good 1080p @ 60 Hz game without losing pixel quality. However, what is likely going to happen...
The Great MacOS 9
This "An OS 9 odyssey: Why these Mac users won’t abandon 16-year-old software." is an awesome article. OS 9 was the peak of Apple operating systems. Low latency, instant response. If only the industry...
Parallel Noise Generation
Re a related Twitter Post ... Concerning making tile-able textures for grain or noise, and getting various desired properties. My preference is towards algorithms which parallelize trivially. The...
T4K
Tries : 3 2 1. This post is just me using Blogger as an active notepad to paper design an FPGA soft-core for a many-core machine. Hopefully I'll update it once in a while. The aim is to try to see what can...
TK4 - Try 2
Tries : 3 2 1. Update Log: 2016/10/03 : Initial posting. Most of pipelining figured out. Working through DSP input and operation details. Have to think through data and return stack usage cases, decide if...
TK4 - Try 3
Tries : 3 2 1. Update Log: 2016/10/04 : Trying dropping data stack (won't fit, gets expensive with adders if doing indexed access). Have something figured out for DSP inputs. Rethinking ISA, have place...
Epiphany-V Taped Out
Epiphany V Tape Out Page | Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip PDF. Looks like it does 2 32-bit d = a * b + c floating point operations, or 4096 flops/clock. If they hit 1 GHz they would...
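The 4096 flops/clock figure above can be reproduced with a little arithmetic. This is only a sketch under stated assumptions: that each of the 1024 processors issues the 2 multiply-adds per clock, and that each d = a * b + c counts as 2 floating point operations. The function names are made up for illustration, and the 1 GHz clock is the hypothetical from the excerpt, not a measured number.

```python
# Assumptions (not confirmed by the excerpt): the 2 multiply-adds are per
# processor per clock, and one d = a * b + c counts as 2 flops.
CORES = 1024                 # processors on the chip, from the PDF title
FMA_PER_CORE_PER_CLOCK = 2   # "2 32-bit d = a * b + c" operations
FLOPS_PER_FMA = 2            # one multiply plus one add


def flops_per_clock(cores=CORES, fma=FMA_PER_CORE_PER_CLOCK,
                    flops_per_fma=FLOPS_PER_FMA):
    """Peak floating point operations issued per clock across the chip."""
    return cores * fma * flops_per_fma


def peak_gflops(clock_ghz):
    """Peak GFLOPS at a given clock in GHz (hypothetical clock rate)."""
    return flops_per_clock() * clock_ghz


print(flops_per_clock())   # 4096
print(peak_gflops(1.0))    # 4096.0
```

Under those assumptions the per-clock number matches the one quoted, and the peak rate scales linearly with whatever clock is actually reached.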
Forth Hardware Thoughts
James Bowman's FPGA based J1 : Site | PDF | Presentation | Forth Source. Chuck Moore : Arithmetic | Instruction Set | Ether Forth | Problem Oriented Language. GA144 : GreenArrays, 144 cores, 9216 18-bit words of...
Technical Evaluation of Traditional vs New "HDR" Encoding Crossed With...
Conclusion first. There is no technical justification for using the new HDR signal standards for HDR. Classic 8-bit/channel or 10-bit/channel Gamma 2.2 output is more than adequate for the full...
Possible Directional Routing Hoplite Variant?
Thinking about minimal grid based routing. Two things I don't like about the Hoplite: (1.) full chip return paths, (2.) route length not proportional to 2D locality. Like the simplified router and...
Atomic Scatter-Only Gather-Free Machines
GPUs are built around having texture caches, and caches are built around collecting loads for the most part, because loads typically have the highest memory traffic. So if after stripping out the...
Instruction Fetch Optimization
Been reading the Artix-7 FPGAs Data Sheet: DC and AC Switching Characteristics to better understand requirements for higher clock FPGA design. Seems as if the only point in using BRAM output register...
Notes from Attempting to Understand FPGA Timing Limits
I'm using the table I built below as a quick reference to think through timing while working on design. Reference from last time, Artix-7 FPGAs Data Sheet: DC and AC Switching Characteristics. Working...
DSP and Rounding Notes
Posting a few more notes while in the background I continue to work towards the next design try. DSP FUNCTIONALITY: Some of the Xilinx 7 series DSP functionality...
Variation on Branching Design - Return Only
Thoughts related to Instruction Fetch Optimization, a post which talked about only auto-incrementing the lower 8 bits of the program counter, having even-only branch addresses to remove an ADDer...
Simplified Vulkan Rapid Prototyping
Nothing simple about using Vulkan, so this title is a little misleading ... Trying something new for my next Vulkan based at-home prototyping effort and building from scratch for 64-bit machines only....
Last Blogger Post
Blog is slowly but actively migrating to a Git Wiki: https://github.com/TimothyLottes/TimothyLottesWiki/wiki