UKC

Excel corrupts on saving, won't reopen

New Topic
This topic has been archived, and won't accept reply postings.
 Yanchik 10 Jun 2024

OK, long shot, but... might be worth it...

"We found a problem with some content... " ... "The workbook cannot be opened or repaired by Microsoft Excel because it is corrupt." Sometimes flashes up 10-15 repair attempts before failing. 

I have no idea what Excel thinks is corrupt. There are no macros. The content is primarily one pretty big table that goes out to lookups from other tables (lots of XLOOKUP, INDEX/MATCH) to bring back values and do some mild maths. Quite a lot of error-catching, IFERROR and suchlike to limit the inevitably divisions by zero. A couple of pivot tables so I can do quick error checking but not much. 

What's weird is...

The files are, by my standards, BIG. As xlsx they are about 500Mb, as xlsb about 330Mb. The big table is about 500K lines deep (but not very wide.) This problem began to appear a few weeks back and I had a couple of workarounds. Clearing the Excel cache seemed to work but now doesn't. Working on the logic while the big table was empty and then dropping in the data, saving an xlsb, xlsx version, that worked. 

I'm in trouble now though because the ~50Mb "empty" version is throwing the same error so my workaround isn't working. 

The files can be opened on Apple Macs, and they can be read by Tableau. So as I say, I don't think they're actually corrupt. But I don't have a clue how to proceed... of course, I have a few older versions, and they open, but my whole working method is kinda unreliable now. 

- Any thoughts what might be the problem ? Functions to avoid, things to look out for ? 

- The empty version seems kinda bloated to me. There's really only a few tables in it; less than 12, none of them are deeper than 1,000 lines. Is there hidden cruft I should be trying to trim off ? 

- Best of all - how to open/recover my files and have a working method that... works... 

Yep, we're trying to move the whole thing to a Python environment sometime. Great, but I'm a subject matter person, not a coding/IT person, and I need to progress the job. 

Thanks in advance...

Y

 sandrow 10 Jun 2024
In reply to Yanchik:

You may have thousands of empty lines at the bottom of your sheets or columns to right - happens when you paste stuff in sometimes.

Go to 1st blank row at bottom of sheet then press CTRL-SHIFT-DOWN ARROW and hit delete.

Go to 1st blank column on right of sheet then press CTRL - SHIFT - RIGHT ARROW and hit delete

Then save as a new file name and see what happens.

OP Yanchik 10 Jun 2024
In reply to sandrow:

That's taken 10% off the size of the empty file... progress ! Thanks. 

Y

 wintertree 10 Jun 2024
In reply to Yanchik:

My god, what have you done?

I suggest splitting the spreadsheet up in a way that hopefully jettisons any bad state inside the data structures behind the spreadsheet that is probably causing the bug.  Wouldn’t surprise me if it’s related to saving a giant spreadsheet over and over.

1. Copy/paste the data table(s) into a clean spreadsheet ideally using “Paste as Values” to leave anything else behind.  Save as “data.xlsx” or whatever.  Or you could export it as a .csv file then import it in to a clean spreadsheet.

2. Create a second empty spreadsheet in the same folder called “logic.xlsx” or whatever.  Build your formulae here and when selecting cell ranges, switch over to the window for “data.xlsx” and select them.  Excel will reference the other spreadsheet.

Hopefully both will save fine, and you can keep saving logic.xlsx without churning the save on the big data file.  If one does fail to save, you’ve narrowed it down.

> Yep, we're trying to move the whole thing to a Python environment 

This is the way.  

> Great, but I'm a subject matter person, not a coding/IT person, and I need to progress the job. 

Well I hope my cargo cult fixes work but I think the universe is trying to tell you to learn Python!

In reply to Yanchik:

Does it open in libreoffice?

 kevin stephens 10 Jun 2024
In reply to Yanchik:

One thing that will grossly inflate Excel file sizes is range names linked to other files, these can be produced accidentally if importing data from other files. Have a look in the list (Formulas > Name Manager) and delete those that aren’t needed. Problems with largish files can also occur if Excel is set up as 32 bit instead of 64 bit, which may explain your PC vs Mac difference?

Post edited at 18:10
OP Yanchik 10 Jun 2024
In reply to Yanchik:

Ooh interesting ideas to try. Thanks folks ! I'll report back...

 tew 11 Jun 2024
In reply to Yanchik:

Save multiple copies including one as a .csv

.csv is a text file version that everything can open.

It means you have a safe back up

 Mike-W-99 11 Jun 2024
In reply to tew:

csv won't preserve much other than the data

 montyjohn 11 Jun 2024
In reply to Yanchik:

If Wintertree's idea to copy the data values into a clean spreadsheet doesn't work, is there any mileage in turning of to auto calculate option?

In the formulas tab in the ribbon, in "Calculation Options" you can select select "Manual" to stop Excel from automatically calculating formulas.

I created a quick table of random data, 3 cells wide by 500k long, and a fourth column just averaging the three values, and it's 35mb.

As a csv it's 25MB so 35mb is probably reasonable for a table of that size.

Not sure how you're getting to 500mb. I can only assume it has lot's of varying cell formatting from all the drag downs you've been doing. I'd be surprised if values only copied to a clean workbook didn't make a massive improvment. I wouldn't split the files however, I find excel to be quite clunky when referencing data from another spreadsheet so it may add additional problems.

 montyjohn 11 Jun 2024
In reply to Mike-W-99:

> csv won't preserve much other than the data

csv will preserve all of the data. It won't preserve some of the logic, but logic is usually easy and quick to re-create once the thinking has already been done.

 Mike-W-99 11 Jun 2024
In reply to montyjohn:

It sounds like the original is full of macros. 

 montyjohn 11 Jun 2024
In reply to Mike-W-99:

OP said there are no macros

 Mike-W-99 11 Jun 2024
In reply to montyjohn:

The vlookups are in all but name. They sound pretty complicated.

 compost 11 Jun 2024
In reply to Yanchik:

Is it saved locally or in Sharepoint?

Edit: It took all my willpower not to ask whether you've tried turning it off and back on again

Post edited at 09:54
OP Yanchik 11 Jun 2024
In reply to Yanchik:

It's locally saved. It doesn't look up anything external, all the lookups are in tables in other tabs. 

It has no complex or conditional formatting (no need, and I'm aware complex formatting tends to mess stuff up.) Just the table format that comes when you select table. 

No macros. No COUNTIFs or SUMIFs or anything that needs repetitive sweeps down a column. A couple of pivot tables so I can do quick checks. 

Calculation has been set to manual for months now, it's a huge time-save. Takes about eight minutes to run. 

Mostly XLOOKUPs, some two-way INDEX MATCH. No VLOOKUPs or HLOOKUPs they're obsolete. 

64 bit setup.

It's got ~500K lines because that's the (compressed) minimum size of data for what I'm doing. Width, about 60 columns of which the first twenty are flat input data, the last 40 are the value-add. 

I don't need the data, I've got that because I've been trying to follow good practices throughout. Looks like I'll have to do a clean rebuild, which might or might not be a major faff rebuilding/relinking all the lookup references. Now the thing is a bit more mature, it shouldn't need a thousand more saves (is that where the problem's arising ?) 

If anyone's got any other "good practice for big Excel", gratefully received ! It turns out I'm the power user in my organisation (slightly surprising/alarming), the MS SARA recovery app is a complete waste of time, and that despite the huge university of training courses we are supposed to learn from, there's not much of actual relevance...

Y

Post edited at 10:45
 Rob Exile Ward 11 Jun 2024
In reply to Yanchik:

Really, I know you don't want to hear this but... have you any idea how straightforward and robust that application would be in Access? (Other relational databases are available, but Access probably has the simplest interface.) Excel was designed to do one job - financial modelling. The fact that it has been abused by millions of user, abetted by Microsoft, to handle large volumes of data doesn't make it right.

I think the amount of time spent by people trying to use Excel for something it was never designed to do, and the resulting frustration, time wasting and resulting errors,  is beyond decent contemplation.

 compost 11 Jun 2024
In reply to Yanchik:

> It's locally saved. It doesn't look up anything external, all the lookups are in tables in other tabs. 

Try saving it in Sharepoint/ Onedrive (or similar) and sharing with someone else to see if they get the same issues?

 kevin stephens 11 Jun 2024
In reply to Yanchik: I’ve found that a clean rebuild can be the best option when things go wrong with my very large spreadsheets, not least because you can structure them in a way that may make diagnostics a little easier. Unexpected incompatibilities can also be frustrating, for example Office 365 compared to standard office, for example conditional formatting from a list having to be on the same sheet as the list

Post edited at 10:59
 kevin stephens 11 Jun 2024
In reply to Rob Exile Ward:

> Really, I know you don't want to hear this but... have you any idea how straightforward and robust that application would be in Access? (Other relational databases are available, but Access probably has the simplest interface.) Excel was designed to do one job - financial modelling. The fact that it has been abused by millions of user, abetted by Microsoft, to handle large volumes of data doesn't make it right.

> I think the amount of time spent by people trying to use Excel for something it was never designed to do, and the resulting frustration, time wasting and resulting errors,  is beyond decent contemplation.

Like this:

https://www.bbc.co.uk/news/technology-54423988.amp

OP Yanchik 11 Jun 2024
In reply to Rob Exile Ward:

No idea. But it's a worthwhile thought, thanks. It feels like the entry barrier is lower and the skill more valuable (more widely applicable) than Python. I'll look into it... it's not a thought I'd have ever had for myself...

 Rob Exile Ward 11 Jun 2024
In reply to Yanchik:

Designing the tables is key, but interestingly neither complicated or technical - 'inspired common sense' it has been called. If you need some pointers, feel free to pm me - at one point I was going to write a book, working title 'Throw away that spreadsheet!' But only ever managed the first chapter.

OP Yanchik 11 Jun 2024
In reply to Rob Exile Ward:

Appreciated, thank you. I've just had an hour on how it would be implemented in Access from our internal resource... for whom we'd have to pay. But it looks a lot more accessible than I had expected, and a more valuable transferable skillset than being a mediocre Python coder...

Got some thinking to do...

Y

 mountainbagger 11 Jun 2024
In reply to Rob Exile Ward:

> at one point I was going to write a book, working title 'Throw away that spreadsheet!' But only ever managed the first chapter.

Did you start writing it in Excel and after one chapter it got too big??

OP: have you saved a copy without the pivot tables to see if that makes a difference?

 freeflyer 11 Jun 2024
In reply to Yanchik:

> Appreciated, thank you. I've just had an hour on how it would be implemented in Access from our internal resource... for whom we'd have to pay. But it looks a lot more accessible than I had expected, and a more valuable transferable skillset than being a mediocre Python coder...

> Got some thinking to do...

> Y

Get a quote from your internal resource for the implementation, including time spent with you shadowing his work so you can take it over when he's done. This would be money well spent, and has the potential to keep your project on track.

Good luck.

 Rob Exile Ward 11 Jun 2024
In reply to mountainbagger:

Kept on adding features!


New Topic
This topic has been archived, and won't accept reply postings.
Loading Notifications...