Gameboy Development Forum

Discussion about software development for the old-school Gameboys, ranging from the "Gray brick" to Gameboy Color
(Launched in 2008)

#1 2019-07-13 00:30:06

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

MBC5 in WinCUPL (problem)

DISCLAIMER: I am new to programmable logic and may ask some stupid questions. I apologize in advance. I have no formal training/education in this field.

*EDIT: I may or may not still use WinCUPL, I've started playing around with Xilinx ISE Webpack and have been impressed so far.*

My last project was so much fun that I've finally decided to start another. What I've got is a flash cartridge with 2 MB flash, 32 KB FRAM, a genuine MBC3 chip, and solderless battery swap via a coin cell retainer. For the next step, I would like to expand on that by including ALL the bells and whistles: 8 MB flash, 128 KB FRAM, RTC, and a CPLD to replace the MBC, coded in WinCUPL or VHDL.

The logic part will be easy enough to figure out, given all the documentation that's out there. The real problem is powering the circuit. All I know is that I can't use 5 V throughout, because the parts just aren't available. Also, a CPLD that runs from a coin cell (for the RTC) would need a supply voltage below 3 V, I think? And realistically, it would probably be wise to shoot lower than that, so the battery can discharge even longer before dropping below the required supply voltage. So we may be looking at a 1.8 V CPLD, which will likely require its own regulator...geez.

Last edited by WeaselBomb (2019-07-27 13:24:51)

#2 2019-07-24 22:53:47

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

I got tired of digging through datasheets, so I started wiring up a circuit. I had some 2 MB flash lying around, so I wired it up without an MBC (as a first step) and verified that I could read/write Tetris to/from the flash. Next, I will add the only CPLD I currently have on hand, an Atmel ATF1504ASL. For the I/O list, I figure I will need the following:
-4 Address lines (A15..A12)
-8 Data lines (D7..D0)
-6 control signals (WR, RD, CS, RAMCS, ROM WE, RESET)
-10 bank select lines (HiAddress22..HiAddress13, RAM and ROM can share 14-16)

That's 28 I/O lines, which should leave me 4 I/O still open. That's good, because I can still add RTC, and drive level shifters' OE if I go with a 1.8v CPLD.

I think my next goals are:
-write the MBC1-5 code in WinCUPL/VHDL/Verilog
-start picking out parts for the cartridge I want to make

https://i.imgur.com/eBiZzZIm.jpg?1

Last edited by WeaselBomb (2019-07-24 23:12:35)

#3 2019-07-27 13:14:33

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

Well, the code required to make 32 KB roms work was easy enough. I thought something was wrong with my ROM chip, but then I remembered that I never put in a pull-up resistor on #WE...whoops. For now, I'm toggling #WE using the cart's Audio pin, and I'm thinking that's probably a better idea than generating a #WE signal from the CPLD.

/***************INPUTS***************/
pin 3 = A14;
pin 39 = RESET;

/***************OUTPUTS***************/
pin 23 = HiRom0;    /*RA14*/


/**************EQUATIONS**************/
HiRom0 = A14 & RESET;

Well, I'm satisfied with the setup so far, so next I guess it's time to try actual rom banking.

#4 2019-07-27 15:33:07

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

Well, I expanded the code a bit to try some rom banking, and it kind of works. I wanted to start small, so I decided to try Super Mario Land, which is MBC1 + 64 KB (no ram). The flasher was able to read/write it with no errors. When I try to read it on my Gameboy, however, it freezes after the boot screen disappears. It makes me wonder if I'm missing something that the MBC1 needs...will have to experiment some more.

*EDIT: changed bit 0 of the rom bank register to be preset to 1 after RESET is asserted. This allows roms that have no MBC to still be able to read 0x4000 to 0x7FFF.*

/***************INPUTS***************/
pin 2 = A15;
pin 3 = A14;
pin 5 = A13;

pin 8 = D0;
pin 10 = D1;

pin 39 = RESET;
pin 44 = WR;

/***************OUTPUTS***************/
pin 23 = HiRom0;    /*RA14*/
pin 25 = HiRom1;    /*RA15*/


/**************VARIABLES**************/

/*Rom Bank Register*/
node RomBank0;
node RomBank1;
RomBankSwitch = !A15 & !A14 & A13 & !WR;

RomBank0.L = D0 # !D1;
RomBank1.L = D1;

RomBank0.LE = RomBankSwitch;
RomBank1.LE = RomBankSwitch;

RomBank0.AP = !RESET;
RomBank1.AR = !RESET;

/**************EQUATIONS**************/
HiRomAccess = A14 & RESET;

HiRom1 = RomBank1 & HiRomAccess;
HiRom0 = RomBank0 & HiRomAccess;
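
For reference, here is a quick Python model of the equations above, to show which bank each written value ends up selecting. It is only a behavioural sketch: the function names are made up for illustration, and it ignores timing and the latch-enable decoding entirely.

```python
def latch_value(data):
    """Value captured by the two latch bits: RomBank0 = D0 # !D1, RomBank1 = D1."""
    d0 = data & 1
    d1 = (data >> 1) & 1
    bank0 = d0 | (d1 ^ 1)   # D0 OR (NOT D1)
    bank1 = d1
    return (bank1 << 1) | bank0


def rom_high_address(latched, a14, reset_n=1):
    """HiRom1..HiRom0: the latched bank is only driven while A14 is high
    and RESET (active low) is not asserted; otherwise both outputs are 0."""
    if a14 and reset_n:
        return latched
    return 0


# Show the bank selected for every 3-bit value a game might write.
for written in range(8):
    print(written, "->", latch_value(written))
```

Only D1 and D0 reach the latch, so any value above 3 is folded onto one of the first four results.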

Last edited by WeaselBomb (2019-07-28 01:23:45)

#5 2019-08-04 18:38:45

Tauwasser
Member
Registered: 2010-10-23
Posts: 88

Re: MBC5 in WinCUPL (problem)

Hi WeaselBomb,

I think it's cool that you document your exploits on here :)

One thing that immediately springs to mind is the fact that you only look at D1 and D0 to do the zero-adjustment logic.
By your logic, selecting ROM bank 0 will actually select ROM bank 1, and selecting ROM bank 1 will also result in ROM bank 1.
Selecting any higher ROM bank will alias in weird ways, because of this switch. The usual expectation would be for ROM bank 4 to alias back to 0 (instead of 1).

You'd basically want to store the value as-is and instead OR the HiRom0 output with the NOR of all internal banking bits.
The real MBC1 of course does zero adjust on all 5 internal bank bits instead of the two you have.
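
A minimal Python sketch of that suggestion, assuming the simplified case where only the 5-bit ROM bank register matters (the upper bank bits and banking mode are left out). The function name `mbc1_bank` and the `rom_banks` parameter are made up for illustration.

```python
def mbc1_bank(written, rom_banks):
    """Bank visible at 0x4000-0x7FFF after writing `written` to the ROM bank
    register. rom_banks is the number of 16 KB banks and must be a power of two."""
    reg = written & 0x1F             # store the value as-is (5 internal bits)
    all_zero = int(reg == 0)         # NOR of all internal banking bits
    out = reg | all_zero             # OR bit 0 of the output with that NOR
    return out & (rom_banks - 1)     # address lines the ROM lacks make banks alias


# 64 KB ROM = 4 banks: 0 is adjusted to 1, but 4 aliases back to 0.
for value in (0, 1, 2, 3, 4):
    print(value, "->", mbc1_bank(value, 4))
```

The point of storing the value unmodified is that the zero check sees all five bits, so only a true zero gets bumped to bank 1.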

Last edited by Tauwasser (2019-08-04 18:44:42)

#6 2019-08-06 20:11:00

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

Hey again, Tauwasser! It's been a while since I've updated this; I got bogged down in trying to work out some ideas for RTC. Turns out that feature is a lot harder to implement than I thought. This is as far as I got before I decided that Xilinx has better options regarding power consumption (I think). I was dead set on using the XC2C32A, until I realized that all the RTC registers wouldn't fit on it...in fact, I don't think the entire design could fit on anything less than the XC2C128! This results in a lot of wasted pins, extra power draw, etc.

I created a test bench in ISE using ONLY the code for the real-time clock, so I could get a ballpark estimate of how much current the 128 would draw when running on battery. It estimated around 20uA (even with the CoolClock feature) which didn't sound too bad, but I think it would drain a CR2025 in around 300-350 days? I'm a Xilinx noob, so there may be other ways to save power (besides the DataGATE) that I'm not aware of. Regardless, that CPLD takes up a TON of space AND requires its own dedicated regulator! Not exactly ideal.
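
The battery estimate is just capacity divided by current. A rough Python check, assuming a nominal CR2025 capacity of about 165 mAh (my assumption, not a measured figure) and ignoring self-discharge and the voltage drop-off near end of life:

```python
def battery_life_days(capacity_mah, current_ua):
    """Idealised runtime: capacity (mAh) divided by a constant draw (uA)."""
    hours = capacity_mah / (current_ua / 1000.0)   # uA -> mA
    return hours / 24.0


# ASSUMPTION: ~165 mAh nominal for a CR2025; 20 uA is the ISE estimate above.
days = battery_life_days(165, 20)
print(f"{days:.0f} days")
```

Halving the current doubles the runtime, so the draw on battery matters far more than small differences in cell capacity.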

Next, I thought maybe I could use an external RTC chip. I think I looked through every single datasheet on digikey and mouser.com before I gave up hope. All of them had at least one of these problems:
-data stored in wrong format (BCD instead of binary)
-day counter only goes from 1-31 (they use a month and year counter instead)
-they require a serial interface
-they use binary, but they only use a 32-bit seconds counter

So yeah, RTC is about to make me jump off of a bridge. I was tempted to finally give up on including it, because the remainder of the design is simple and works great (in theory) as it is. But this week, I've discovered the magic of microcontrollers. As badly as I wanted to do this with a CPLD, I think an MCU might be the best option for RTC. A lot of the devices I've seen have insanely low power draw, integrated RTC, and even dedicated battery inputs! All of this while using even less power than a CoolRunner-II (I think).

Code:

/***************INPUTS***************/
pin 2 = A15;
pin 3 = A14;
pin 5 = A13;
pin 6 = A12;

pin 13 = D0;
pin 14 = D1;
pin 15 = D2;
pin 18 = D3;
pin 19 = D4;
pin 20 = D5;
pin 21 = D6;
pin 22 = D7;

/***************OUTPUTS***************/
pin 23 = HiRom0;    /*RA14*/
pin 25 = HiRom1;    /*RA15*/
pin 27 = HiRom2;    /*RA16*/
pin 28 = HiRom3;    /*RA17*/
pin 30 = HiRom4;    /*RA18*/
pin 31 = HiRom5;    /*RA19*/
pin 33 = HiRom6;    /*RA20*/

pin 39 = RESET;
pin 44 = WR;

/**************VARIABLES**************/
pinnode 615 = RomBank0;
pinnode 613 = RomBank1;
pinnode 612 = RomBank2;
pinnode 611 = RomBank3;
pinnode 610 = RomBank4;
pinnode 609 = RomBank5;
pinnode 608 = RomBank6;
RomBankSwitch = !A15 & !A14 & A13 & !A12 & !WR;
ReadHiRom = A14 & RESET;

RomBank6.L = D6;
RomBank6.LE = RomBankSwitch;
RomBank6.AR = !RESET;

RomBank5.L = D5;
RomBank5.LE = RomBankSwitch;
RomBank5.AR = !RESET;

RomBank4.L = D4;
RomBank4.LE = RomBankSwitch;
RomBank4.AR = !RESET;

RomBank3.L = D3;
RomBank3.LE = RomBankSwitch;
RomBank3.AR = !RESET;

RomBank2.L = D2;
RomBank2.LE = RomBankSwitch;
RomBank2.AR = !RESET;

RomBank1.L = D1;
RomBank1.LE = RomBankSwitch;
RomBank1.AR = !RESET;

RomBank0.L = D0;
RomBank0.LE = RomBankSwitch;
RomBank0.AP = !RESET;

/**************EQUATIONS**************/
HiRom0 =(RomBank0 # !(RomBank1 # RomBank2 # RomBank3 # RomBank4 # RomBank5 # RomBank6)) & ReadHiRom;
HiRom1 = RomBank1 & ReadHiRom;
HiRom2 = RomBank2 & ReadHiRom;
HiRom3 = RomBank3 & ReadHiRom;
HiRom4 = RomBank4 & ReadHiRom;
HiRom5 = RomBank5 & ReadHiRom;
HiRom6 = RomBank6 & ReadHiRom;

Last edited by WeaselBomb (2019-08-06 20:22:33)

#7 2019-08-07 12:10:24

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

Well, after some looking around, an 8 bit PIC microcontroller from Microchip sounds very promising. I haven't locked in an exact device, but a potential one is the PIC16LF191 series.

Here's a datasheet link if anyone's curious: https://www.mouser.com/datasheet/2/268/ … 116203.pdf

It's got a voltage range of 1.8 V to 3.6 V, a dedicated battery input pin, up to 48 I/O, and more RAM than I should ever need for RTC (could also be useful for emulating the MBC2's onboard RAM). It can also be clocked up to 32 MHz.
This means that I don't need any extra diodes/logic for switching to battery backup, PLUS only a single regulator is needed to convert 5 V to 3.3 V!
The only downsides are:
-RTC is stored in BCD, which I will have to manually convert to the correct format.
-Only the Secondary Oscillator (SOSC) is powered by the Vbatt pin. And the SOSC must be an external oscillator. But I was already planning on using an external crystal, so who cares?
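
The BCD conversion itself is cheap. A small Python sketch of the per-byte step, assuming packed BCD (two decimal digits per byte); the function name is made up, and turning a month/year date into a running day counter is a separate problem not shown here:

```python
def bcd_to_binary(bcd):
    """Convert one packed-BCD byte (e.g. 0x59) to its binary value (59)."""
    if not 0 <= bcd <= 0xFF:
        raise ValueError(f"not a byte: {bcd!r}")
    high, low = bcd >> 4, bcd & 0x0F
    if high > 9 or low > 9:
        raise ValueError(f"not a packed BCD byte: {bcd:#04x}")
    return high * 10 + low


print(bcd_to_binary(0x59))   # seconds/minutes register reading 59
```

On the MCU this would be the same shift, mask, multiply and add per register.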

I think I've got all my parts pretty much decided. Here's a rough layout:
https://i.imgur.com/Zacw0POm.png

U3,U4 = Level Shifters
U5,U6 = Flash, RAM
MCU1 = MCU
U7 = CR2025 retainer (this part is AMAZING, takes up relatively little space for the battery size)

Last edited by WeaselBomb (2019-08-08 00:03:34)

#8 2019-08-10 13:36:41

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

One thing I'm working on is driving the DIR signal of my level shifter. I see most people use the #RD signal to drive it, but every dual-supply shifter I've looked at uses the opposite polarity of the #RD signal. I'm wondering if I could just use the #WR signal instead? Since I'm already driving the #OE signal, I would imagine this would work out fine. Could also use a single-supply shifter, but then I wouldn't be able to reach the CMOS standard output level on the 5v side. It would probably still work, but I don't wanna take chances.

Last edited by WeaselBomb (2019-08-10 13:53:25)

#9 2019-08-10 15:12:12

Tauwasser
Member
Registered: 2010-10-23
Posts: 88

Re: MBC5 in WinCUPL (problem)

Many people just generate an #OE signal from their logic. You need to be aware that the #RD and #WR signals are also connected to the internal GameBoy WRAM and you don't want to drive the data pins when WRAM is addressed, so you would need decoding logic anyway.

As for the MCU, you'd have to evaluate if 32 MHz is enough for your purposes. Basically, you should be fine when you just have decoding logic and your U5, U6 are the actual RAM and ROM parts. However, at 32 MHz, you only have 32 machine instructions for each 1 MHz bus cycle of the Game Boy Edge Connector. And sampling uncertainty makes that 16 machine instructions. So you would need pretty tight code to emulate MBC2 or MBC3's RTC.

#10 2019-08-10 17:25:30

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

That sounds good. I'll generate #OE from the MCU and drive DIR with WR since RD has wrong polarity.

You're right, after double sampling, there isn't much breathing room to read/write internal registers. There's a feature in these MCUs that I was hoping could save me some machine cycles:

Code:

The Configurable Logic Cell (CLCx) module provides programmable logic that 
operates outside the speed limitations of software execution. The logic cell 
selects from 40 input signals and, through the use of configurable gates, 
reduces the inputs to four logic lines that drive one of eight selectable 
single-output logic functions.
Input sources are a combination of the following:
• I/O pins
• Internal clocks 
• Peripherals
• Register bits

The output can be directed internally to peripherals and to an output pin.

The MCU has 4 of these cells, but I don't think that's enough to avoid decoding via machine cycles.

Last edited by WeaselBomb (2019-08-10 17:27:15)

#11 2019-08-12 18:24:39

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

Lots of research the past couple of days, and I feel like I generate more questions than answers...

Anyway, I'm thinking maybe these CLC modules can be used to handle the bank switching?
For example, for Rom bank switching, I could set RA14-RA22 = [RomBankRegister] during the write cycle, and use the CLC to enable/disable the output during the read cycle?

Then all I have to do with my instruction cycles is listen for WR to be asserted, and use a jump table to determine which register needs to be written? Something like...

Code:

[Address Bits] | [Register]
0010              | RomBankLow
0011              | RomBankHigh
0001              | RamEnable
0110              | RTC Latch
...etc
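
The lookup described above can be sketched in Python as a dictionary keyed on A15..A12. This only models the decode, not the timing; the function and variable names are made up, and only the four entries listed in the table are filled in.

```python
# Register selected by the top four address bits (A15..A12) during a write.
REGISTER_BY_A15_A12 = {
    0b0010: "RomBankLow",
    0b0011: "RomBankHigh",
    0b0001: "RamEnable",
    0b0110: "RTC Latch",
}

registers = {}   # current contents of each emulated register


def decode_write(address):
    """Return the register name for a 16-bit address, or None if not listed."""
    return REGISTER_BY_A15_A12.get((address >> 12) & 0xF)


def handle_write(address, data):
    """Store the data byte in whichever register the address selects."""
    name = decode_write(address)
    if name is not None:
        registers[name] = data & 0xFF
    return name


print(handle_write(0x2100, 0x05), registers)
```

On the real MCU the dictionary would become a jump table indexed by the four address bits, which is why the decode cost is roughly constant per write.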

I'm not sure if I'm making any sense, or if what I am talking about is even possible. Sorry if it doesn't make sense!
Either way, I wonder if this will need to be written in ASM to get the performance I need. I have ordered the MCU in a DIP package to start testing.

...and I actually just realized that all of my pins can generate an interrupt on either edge, so generating an interrupt on the rising edge of WR might mean I don't have to sample the address bus (because the address and data should already be on the bus at that point)? I could just wait for WR to de-assert, see what the address is, then place the incoming data in the corresponding register?

Last edited by WeaselBomb (2019-08-13 11:53:34)

#12 2019-08-16 20:54:07

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

Well, I got the MCU, but the Configurable Logic Cells won't be able to do what I had originally wanted, so I'm not sure this will work out in the end. Unfortunate, but I'm still gonna see what I can squeeze out of this thing.

Luckily, I was able to borrow a programmer/debugger from work. Turns out, they're pretty expensive.

I decided to start with the simplest test, so I wired it up and was able to flash Tetris. At this point, I'm not even using machine cycles, just one logic cell to pass through A14 to RA14. Next, I guess I'll try Super Mario Land.

#13 2019-08-19 16:02:29

WeaselBomb
Member
Registered: 2018-03-06
Posts: 20

Re: MBC5 in WinCUPL (problem)

I really thought this MCU would perform a bit better. At 32 MHz, I thought that there would be plenty of cycles to perform everything within a Gameboy instruction cycle.

What I didn't realize at first was that the MCU requires 4 clock cycles for each instruction, meaning it tops out at 8 million instructions per second (8 MIPS). The DMG runs at about 1 MIPS. At best, that leaves 8 instructions to do everything until the next cycle.

But the Gameboy Color can double in execution speed, reaching up to 2 MIPS. That only leaves FOUR instructions in each cycle, at most!

And last, if I'm not mistaken, Pokemon Stadium on the N64 can play Gameboy games at 3 TIMES the speed, reaching 3 MIPS! At this point, we have less than 3 instructions per cycle!

With 3-4 instructions, I'm not sure if I could ever hope to pull this off, even if I coded completely in ASM and executed all code exclusively from RAM. I think just polling would eat up 2 of those instructions? And with the inherent latency of interrupts, I think I'm out of options...
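
The budget arithmetic above, written out in Python. The bus rates are the round figures from this post (1, 2 and 3 MHz), not exact clock values, and the function name is made up for illustration.

```python
def instructions_per_bus_cycle(fosc_hz, clocks_per_instruction, bus_hz):
    """MCU instructions that fit into one cartridge bus cycle."""
    mcu_ips = fosc_hz / clocks_per_instruction
    return mcu_ips / bus_hz


FOSC = 32_000_000          # MCU oscillator
CLOCKS_PER_INSTRUCTION = 4

for label, bus_hz in (("DMG", 1_000_000),
                      ("GBC double speed", 2_000_000),
                      ("3x speed", 3_000_000)):
    budget = instructions_per_bus_cycle(FOSC, CLOCKS_PER_INSTRUCTION, bus_hz)
    print(f"{label}: {budget:.2f} instructions per cycle")
```

Any polling loop or interrupt latency has to come out of that same budget, which is what makes the faster cases so tight.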

So, I think it's time to pull the plug on this one. If I can't have full compatibility with all systems (N64, GB, GBC, GBA), then it's not worth moving forward with this. Unless I have an idea that works, like an FPGA + MCU, then I think this is a bust.

Last edited by WeaselBomb (2019-08-19 16:07:36)
