JON BURN
Jon has been programming with Microsoft Windows since the mid-1980s. Originally working with C, he now uses Visual Basic for all his programming tasks. He has worked on retail software, such as the PagePlus DTP package, and a lot of other custom software in the corporate environment. Jon has also taught programming and written various articles about it. He is currently working on graphics software for business presentations.
Microsoft Visual Basic 6 further enhances the Variant data type from the previous version so that it can now hold user-defined types (UDTs). This creates yet another reason why you should become familiar with Variants and what they can do. In this chapter I will take an in-depth look at Variants and discuss the benefits and pitfalls of programming with them.
Variants were first introduced in version 2 of Visual Basic as a flexible data type that could hold each of the simple data types. The Variant data type was extended substantially with version 4 to include Byte, Boolean, Error, Objects, and Arrays, and a little further with version 5 to include the Decimal data type. The Decimal data type was the first data type that was not available as a "first class" data type: it is available only within a Variant, and you cannot directly declare a variable as a Decimal.
In Visual Basic 6, UDTs have been added to the list, effectively completing the set. Now a Variant can be assigned any variable or constant, whatever the type.
A variety of functions convert to these subtypes and test for these subtypes. Table 4-1 shows the development of the Variant data type through the versions of Visual Basic, along with the matching functions.
Table 4-1 The Evolution of Variants
Type | Visual Basic Name | Visual Basic Version | Convert Function | Test Function
0 | Empty | 2 | = Empty | IsEmpty
1 | Null | 2 | = Null | IsNull
2 | Integer | 2 | CInt | IsNumeric*
3 | Long | 2 | CLng | IsNumeric
4 | Single | 2 | CSng | IsNumeric
5 | Double | 2 | CDbl | IsNumeric
6 | Currency | 2 | CCur | IsNumeric
7 | Date | 2 | CVDate/CDate | IsDate
8 | String | 2 | CStr |
9 | Object | 4 | | IsObject
10 | Error | 4 | CVErr | IsError
11 | Boolean | 4 | CBool |
12 | Variant | 4 | CVar |
13 | Data Object | 4 | |
14 | Decimal | 5 | CDec | IsNumeric
17 | Byte | 4 | CByte |
36 | UDT | 6 | |
8192 | Array | 4 | Array | IsArray
16384 | ByRef | Never? | |
*Strictly speaking, IsNumeric tests to see if a variable can be converted to a numeric value, and is not simply reporting on a Variant's subtype.
A Variant always takes up at least 16 bytes of memory and is structured as shown in Figure 4-1.
Figure 4-1 The structure of a Variant
The first two bytes correspond to the value returned by the VarType function. (The VarType return values are defined as constants in the VbVarType enumeration.) For example, if the VarType is 2 (the value of the constant vbInteger), the Variant has a subtype of Integer. You cannot change this value directly, but the conversion functions (such as CInt) will do this for you.
The Reserved bytes have no documented function yet; their principal purpose is to pad the structure out to 16 bytes. The Data area holds the value of the variable, if the value fits into 8 bytes; otherwise, the Data area holds a pointer to the data (as with strings and so on). The type indicates how the Data portion of the Variant is to be understood or interpreted.
In this way, Variants are self-describing, meaning they contain within them all the information necessary to use them.
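As a minimal sketch of how the type field can be observed, the following routine prints the result of VarType after each assignment or conversion. The routine name ShowSubtypes is an assumption made for illustration; VarType, TypeName, and the vb constants are built in.

```vb
Sub ShowSubtypes()
    Dim v As Variant

    Debug.Print VarType(v) = vbEmpty      ' True: nothing has been assigned yet
    v = 3
    Debug.Print VarType(v) = vbInteger    ' True: the literal 3 is an Integer
    v = CLng(v)
    Debug.Print VarType(v) = vbLong       ' True: CLng changed the subtype
    v = CStr(v)
    Debug.Print VarType(v) = vbString     ' True: the Data area now refers to a string
    Debug.Print TypeName(v)               ' String
End Sub
```

Run from the Immediate window, this shows the subtype changing while the variable itself remains a Variant throughout.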
In this section I'll discuss the pros and cons of using Variants in place of simple data types such as Integer, Long, Double, and String. This is an unorthodox practice; the standard approach is to avoid the use of Variants, for a number of reasons. We'll look at the arguments against Variants first.
Every journal article on optimizing Visual Basic includes a mention of how Variants are slower than underlying first-class data types. This should come as no surprise. For example, when iterating through a sequence with a Variant of subtype Integer, the interpreted or compiled code must decode the structure of the Variant every time the code wants to use its integer value, instead of accessing an integer value directly. There is bound to be an overhead to doing this.
Plenty of authors have made a comparison using a Variant as a counter in a For loop, and yes, a Variant Integer takes about 50 percent more time than an Integer when used as a loop counter. This margin decreases as the data type gets more complex, so a Variant Double is about the same as a Double, whereas, surprisingly, a Variant Currency is quicker than a Currency. If you are compiling to native code, the proportions can be much greater in certain cases.
Is this significant? Almost always it is not. The amount of time that would be saved by not using Variants would be dwarfed by the amount of time spent in loading and unloading forms and controls, painting the screen, talking to databases, and so on. Of course, this depends on the details of your own application, but in most cases it is highly unlikely that converting local variables from Variants to Integers and Strings will speed up your code noticeably.
When optimizing, you benefit by looking at the bigger picture. If your program is too slow, you should reassess the whole architecture of your system, concentrating in particular on the database and network aspects. Then look at user interface and algorithms. If your program is still so locally computation-intensive and time-critical that you think significant time can be saved by using Integers rather than Variants, you should be considering writing the critical portion in C++ and placing this in a DLL.
Taking a historical perspective, machines continue to grow orders of magnitude faster, which allows software to take more liberties with performance. Nowadays, it is better to concentrate on writing your code so that it works, is robust, and is extensible. If you need to sacrifice efficiency in order to do this, so be it-your code will still run fast enough anyway.
A common argument against Variants is that they take up more memory than do other data types. In place of an Integer, which normally takes just 2 bytes of memory, a Variant of 16 bytes is taking eight times more space. The ratio is less, of course, for other underlying types, but the Variant always contains some wasted space.
The question is, as with the issue of performance in the previous section, how significant is this? Again I think not very. If your program has some extremely large arrays-say, tens of thousands of integers-an argument could be made to allow Integers to be used. But they are the exception. All your normal variables in any given program are going to make no perceptible difference whether they are Variants or not.
I'm not saying that using Variants improves performance or memory. It doesn't. What I'm saying is that the effect Variants have is not a big deal-at least, not a big enough deal to outweigh the reasons for using them.
A more complex argument is the belief that Variants are poor programming style-that they represent an unwelcome return to the sort of dumb macro languages that encouraged sloppy, buggy programming.
The argument maintains that restricting variables to a specific type allows various logic errors to be trapped at compile time, an obviously good thing. Variants, in theory, take away this ability.
To understand this issue fully we must first look at the way non-Variant variables behave. In the following pages I have split this behavior into four key parts of the language, and have contrasted how Variants behave compared to simple data types in each of these four cases:
Assignment
Function Calls
Operators and Expressions
Visual Basic Functions
Consider the following code fragment (Example A):
    Dim i As Integer, s As String
What happens? Well, it depends on which version of Visual Basic you run. In pre-OLE versions of Visual Basic you got a Type mismatch error at compile time. In Visual Basic 6, there are no errors at compile time, but you get the trappable Type mismatch error 13 at run time when the program encounters the i = s line of code.
NOTE
Visual Basic 4 was rewritten using the OLE architecture; thus, versions 3 and earlier are "pre-OLE."
The difference is that the error occurs at run time instead of being trapped when you compile. Instead of you finding the error, your users do. This is a bad thing.
The situation is further complicated because it is not the fact that s is a String and i is an Integer that causes the problem. It is the actual value of s that determines whether the assignment can take place.
This code succeeds, with i set to 1234 (Example B):
    Dim i As Integer, s As String
This code in Example C does not succeed (you might have thought that i would be set to 0, but this is not the case):
    Dim i As Integer, s As String
These examples demonstrate why you get the error only at run time. At compile time the compiler cannot know what the value of s will be, and it is the value of s that decides whether an error occurs.
The behavior is exactly the same with this piece of code (Example D):
    Dim i As Integer, s As String
As in Example C, a type mismatch error will occur. In fact, Example C is exactly the same as Example D. In Example C, a hidden call to the CInt function takes place. The rules that determine whether CInt will succeed are the same as the rules that determine whether the plain i = s will succeed. This is known as implicit type conversion, although some call it "evil" type coercion.
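The run-time behavior described for these examples can be checked with a small helper. This is only a sketch: the helper name TryAssign and the value "Hello" are assumptions made for illustration, while "1234" and the empty string come from the discussion of Examples B and C.

```vb
' Returns the error number raised by an implicit String-to-Integer
' assignment, or 0 if the assignment succeeds.
Function TryAssign(ByVal s As String) As Long
    Dim i As Integer
    On Error Resume Next
    i = s                       ' hidden call to CInt(s)
    TryAssign = Err.Number
End Function

Sub DemoAssign()
    Debug.Print TryAssign("1234")    ' 0: the assignment succeeds
    Debug.Print TryAssign("")        ' 13: Type mismatch
    Debug.Print TryAssign("Hello")   ' 13: Type mismatch (assumed value)
End Sub
```

Replacing i = s with i = CInt(s) inside the helper should give the same three results, which is the point of Example D.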
The conversion functions CInt, CLng, and so on, are called implicitly whenever there is an assignment between variables of different data types. The actual functions are implemented within the system library file OLEAUT32.DLL. If you look at the exported functions in this DLL, you'll see a mass of conversion functions. For example, you'll see VarDecFromCy to convert a Currency to a Decimal, or VarBstrFromR8 to convert a string from an 8-byte Real, such as a Double. The code in this OLE DLL function determines the rules of the conversion within Visual Basic.
If the CInt function had worked the same way as Val does, the programming world would've been spared a few bugs (Example E).
    Dim i As Integer, s As String
This example succeeds because Val has been defined to return 0 when passed the empty string. The OLE conversion functions, being outside the mandate of Visual Basic itself, simply have different rules (Examples F and G).
    Dim i As Integer, s As String
Examples F and G also yield different results. In Example F, i becomes 1, but in Example G, i becomes 1234. In this case the OLE conversion functions are more powerful in that they can cope with the thousands separator. Further, they also take account of the locale, or regional settings. Should your machine's regional settings be changed to German standard, Example G will yield 1 again, not 1234, because in German the comma is used as the decimal point rather than as a thousands separator. This can have both good and bad side effects.
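The contrast between Val and the OLE conversion functions can be sketched as follows. The routine name is an assumption, and the CInt result shown assumes English (US) regional settings, as in the discussion above.

```vb
Sub DemoValVersusCInt()
    Dim s As String

    s = "1,234"
    Debug.Print Val(s)      ' 1: Val stops reading at the comma
    Debug.Print CInt(s)     ' 1234 under English (US) regional settings

    s = ""
    Debug.Print Val(s)      ' 0: Val accepts the empty string
End Sub
```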
These code fragments, on the other hand, succeed in all versions of Visual Basic (Examples H and I):
    Dim i As Variant, s As Variant
In both the above cases, i is still a string, but why should that matter? By using Variants throughout our code, we eliminate the possibility of type mismatches during assignment. In this sense, using Variants can be even safer than using simple data types, because they reduce the number of run-time errors. Let's look now at another fundamental part of the syntax and again contrast how Variants behave compared to simple data types.
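To illustrate the Variant assignments in Examples H and I, here is a minimal sketch. The routine name and the value "Hello" are assumptions for illustration; no conversion takes place, so no type mismatch can occur.

```vb
Sub DemoVariantAssign()
    Dim i As Variant, s As Variant

    s = "Hello"                  ' assumed value
    i = s                        ' i simply takes on the subtype String
    Debug.Print TypeName(i)      ' String

    s = "1234"
    i = s
    Debug.Print TypeName(i)      ' String, even though the contents look numeric
    Debug.Print IsNumeric(i)     ' True
End Sub
```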
LOCALE EFFECTS
Suppose you were writing a little calculator program, where the user types a number into a text box and the program displays the square of this number as the contents of the text box change.
    Private Sub Text1_Change()
Note that the IsNumeric test verifies that it is safe to multiply the text box contents without fear of type mismatch problems. Suppose "1,000" was typed into the text box: the label underneath would show 1,000,000 or 1, depending on the regional settings. On the one hand, it's good that you get this international behavior without performing any extra coding, but it could also be a problem if the user was not conforming to the regional setting in question. Further, to prevent this problem, if a number is to be written to a database or file, it should be written as a number without formatting, in case it is read at a later date on a machine where the settings are different.
You should also avoid writing any code yourself that parses numeric strings. For example, if you were trying to locate the decimal point in a number using string functions, you might have a problem:
    InStr(53.6, ".")
This line of code will return 3 on English/American settings, but 0 on German settings.
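A handler along the lines described might look like the following sketch. It assumes a form with a text box named Text1 and a label named Label1; the label name and the Else branch are assumptions, not taken from the original listing.

```vb
Private Sub Text1_Change()
    If IsNumeric(Text1.Text) Then
        ' Both operands are converted to numbers using the regional settings
        Label1.Caption = Text1.Text * Text1.Text
    Else
        Label1.Caption = ""
    End If
End Sub
```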
Note, finally, that Visual Basic itself does not adhere to this convention in its own source code. The number 53.6 means the same whatever the regional settings. We all take this for granted, of course.
Consider the following procedure:
    Sub f(ByVal i As Integer, ByVal s As String)
This procedure is called by the following code:
    Dim i As Integer, s As String
You'll notice I put the parameters in the wrong order.
With pre-OLE versions of Visual Basic you get a Parameter Type Mismatch error at compile time, but in Visual Basic 4, 5, and 6 the situation is the same as in the previous example: a run-time type mismatch, depending on the value in s, and whether the implicit CInt could work.
Instead, the procedure could be defined using Variants:
    Sub f(ByVal i As Variant, ByVal s As Variant)
Now no run-time errors or compile-time type mismatch errors occur. Of course, it's not necessarily so obvious by looking at the declaration what the parameters mean, but then that's what the parameter name is for.
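As a sketch of the Variant version of the call, the following passes the arguments in the wrong order without any error. The body of f, the caller name CallF, and the values assigned are assumptions for illustration.

```vb
Sub f(ByVal i As Variant, ByVal s As Variant)
    ' Report what was actually received
    Debug.Print TypeName(i), TypeName(s)
End Sub

Sub CallF()
    Dim i As Integer, s As String
    i = 3
    s = "Hello"             ' assumed values
    f s, i                  ' wrong order, yet no type mismatch: prints String, Integer
End Sub
```

The call succeeds, but the procedure now receives a String where it expected a number, so the mistake has been deferred rather than removed.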
Returning to our survey of how Variants behave compared to simple data types, we now look at expressions involving Variants.
I have already suggested, for the purposes of assignment and function parameters and return values, that using Variants cuts down on problematic run-time errors. Does this also apply to the use of Visual Basic's own built-in functions and operators? The answer is, "It depends on the operator or function involved."
Arithmetic operators. All the arithmetic operators (such as +, -, *, \, /, and ^) evaluate their parameters at run time and throw the ubiquitous type mismatch error if the parameters do not apply. With arithmetic operators, there is neither an advantage nor a disadvantage to using Variants instead of simple data types; in either case, it's the value, not the data type, that determines whether the operation can take place. In Example A, we get type mismatch errors on both lines:
    Dim s As String, v As Variant
But in Example B, these lines both succeed:
    Dim s As String, v As Variant
A lot of implicit type conversion is going on here. The parameters of "-" are converted at run time to Doubles before being supplied to the subtraction operator itself. CDbl("Fred") does not work, so both lines in Example A fail. CDbl("123") does work, so the subtraction succeeds in both lines of Example B.
There is one slight difference between v and s after the assignments in Example B: s is a string of length 1 containing the value 0, while v is a Variant of subtype Double containing the value 0. The subtraction operator is defined as returning a Double, so 0 is returned in both assignments. This is fine for v - v, which becomes a Variant of subtype Double, with value 0. On the other hand, s is a string, so CStr is called to convert the Double value to 0.
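The subtraction behavior can be sketched as follows, using the values "123" and "Fred" mentioned in the text. The routine name is an assumption.

```vb
Sub DemoSubtract()
    Dim s As String, v As Variant

    s = "123"
    v = "123"
    s = s - s                          ' Double 0, converted back with CStr
    v = v - v                          ' v becomes a Variant of subtype Double
    Debug.Print TypeName(s), Len(s)    ' String, length 1
    Debug.Print TypeName(v)            ' Double

    On Error Resume Next
    v = "Fred"
    v = v - v                          ' CDbl("Fred") fails
    Debug.Print Err.Number             ' 13: Type mismatch
End Sub
```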
All other arithmetic operators behave in a similar way to subtraction, with the exception of +.
Option "Strict Type Checking"
Some other authors have argued for the inclusion of another option along the lines of "Option Explicit" that would enforce strict type checking. Assignment between variables of different types would not be allowed and such errors would be trapped at compile time. The conversion functions such as CInt and CLng would need to be used explicitly for type conversion to take place.
This would effectively return the Visual Basic language to its pre-OLE style, and Examples A, B, and C would all generate compile-time errors. Example D would still return a run-time type mismatch, however.
Examples H and I, which use only Variants, would succeed with the same results as above. In other words, code using Variants would be unaffected by the feature.
Comparison operators. We normally take the comparison operators (such as <, >, and =) for granted and don't think too much about how they behave. With Variants, comparison operators can occasionally cause problems.
The comparison operators are similar to the addition operator in that they have behavior defined for both numeric and string operands, and unfortunately this behavior is different.
A string comparison will not necessarily give the same result as numeric comparison on the same operands, as the following examples show:
    Dim a, b, a1, b1
Notice also that all four variables (a, b, a1, and b1) are numeric in the sense that IsNumeric will return True for them.
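A minimal sketch of the difference between string and numeric comparison follows. The values "1,000" and "500" are assumptions chosen for illustration, and the CDbl results assume English (US) regional settings.

```vb
Sub DemoCompare()
    Dim a, b, a1, b1

    a = "1,000"
    b = "500"
    a1 = CDbl(a)            ' 1000 under English (US) settings
    b1 = CDbl(b)            ' 500

    Debug.Print a > b       ' False: string comparison, "1" sorts before "5"
    Debug.Print a1 > b1     ' True: numeric comparison
End Sub
```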
As with string and number addition, the net result is that you must always be aware of the potential bugs here and ensure that the operands are converted to a numeric or string subtype before the operator is used.
Visual Basic's own functions work well with Variants, with a few exceptions. I won't cover this exhaustively but just pick out some special points.
The Visual Basic mathematical functions work fine with Variants because they each have a single behavior that applies only to numerics, so there is no confusion. In this way, these functions are similar to the arithmetic operators. Provided the Variant passes the IsNumeric test, the function will perform correctly, regardless of the underlying subtype.
    a = Hex("1,234")
Type mismatch errors will be raised should the parameter not be numeric.
The string functions do not raise type mismatch errors, because all simple data types can be converted to strings (for this reason there is no IsString function in Visual Basic). Thus, you can apply the string functions to Variants with numeric subtypes-Mid, InStr, and so forth all function as you would expect. However, exercise extreme caution because of the effect regional settings can have on the string version of a numeric. (This was covered earlier in the chapter.)
The function Len is an interesting exception, because once again it has different behavior depending on what the data type of the parameter is. For simple strings Len returns the length of the string. For simple nonstring data Len returns the number of bytes used to store the variable. However, less well known is the fact that for Variants, it returns the length of the Variant as if it were converted to a string, regardless of the Variant's actual subtype.
    Dim v As Variant, i As Integer
This provides one of the only ways of distinguishing a simple Integer variable from a Variant of subtype Integer at run time.
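The two behaviors of Len can be sketched like this. The routine name and the value 100 are assumptions for illustration.

```vb
Sub DemoLen()
    Dim v As Variant, i As Integer

    v = 100
    i = 100
    Debug.Print Len(v)      ' 3: the length of "100"
    Debug.Print Len(i)      ' 2: the number of bytes used to store an Integer
End Sub
```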
Some time ago, while I was working for a big software house, I heard this (presumably exaggerated) anecdote about how the company had charged a customer $1 million to upgrade the customer's software. The customer had grown in size, and account codes required five digits instead of four. That was all there was to it. Of course, the client was almost certainly being ripped off, but there are plenty of examples in which a little lack of foresight proves very costly to repair. The Year 2000 problem is a prime example. It pays to allow yourself as much flexibility and room for expansion as can reasonably be foreseen. For example, if you need to pass the number of books as a parameter to a function, why allow only up to 32,767 books (the maximum value of an Integer)? You might also need to allow for half a book, so you wouldn't want to restrict it to Integer or Long. You'd want to allow floating-point inputs. You could at this point declare the parameter to be of type Double because this covers the range and precision of Integer and Long as well as handling floating points. But even this approach is still an unnecessary restriction. Not only might you still want the greater precision of Currency or Decimal, you might also want to pass in inputs such as "An unknown number of books."
The solution is to declare the number of books as a Variant. The only commitment that is made is about the meaning of the parameter (that it contains a number of books), and no restriction is placed on that number. As much flexibility as possible is maintained, and the cost of those account code upgrades will diminish.
    Function ReadBooks(ByVal numBooks As Variant)
Suppose we want to upgrade the function so that we can pass "An unknown number of books" as a valid input. The best way of doing this is to pass a Variant of subtype Null. Null is specifically set aside for the purpose of indicating "not known."
If the parameter had not been a Variant, you would have had some choices:
Allow a special value to indicate unknown, perhaps -1 or maybe 32768. We might create a constant of this value so that the code reads a little better (Const bkNotKnown = -1) and use that. This approach leads to bugs. Sooner or later, you or another programmer will forget that -1 is reserved and use it as an ordinary value of number of books, however unlikely that may seem at the time you choose the value of the constant.
If the parameters are Variants, you avoid these unsatisfactory choices when modifying the functions. In the same way, parameters and return types of class methods, as well as properties, should all be declared as Variants instead of first-class data types.
HUNGARIAN NOTATION
The portion of Hungarian notation that refers to data type has little relevance when programming with Variants. Indeed, as variables of different data types can be freely assigned and interchanged, the notation has little relevance in Visual Basic at all.
I still use variable prefixes, but only to assist in the categorization of variables at a semantic level. So, for example, "nCount" would be a number that is used as a counter of something. The n in this instance stands for a general numeric, not an Integer.
I have extolled the virtues of using Variants and the flexibility that they give. To be more precise, they allow the interface to be flexible. By declaring the number of books to be a Variant, you make it unlikely that the data type of that parameter will need to be modified again.
This flexibility of Variants has a cost to it. What happens if we call the function with an input that doesn't make sense?
    N = ReadBooks("Ugh")
Inside the function, we are expecting a number, so what will it make of this? If we are performing some arithmetic operations on the number, we risk a type mismatch error when a Variant with these contents is passed. You must assert your preconditions for the function to work. If, as in this instance, the input must be numeric, be sure that this is the case:
    Function ReadBooks(ByVal input As Variant) As Variant
In other words, you code defensively by using the set of Is functions to verify that a parameter is suitable for the operation you're going to perform on it.
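A defensive version of such a function might be sketched as follows. The parameter name numBooks, the placeholder calculation, and the choice to raise error 13 for unsuitable input are all assumptions for illustration; the text does not say what ReadBooks actually computes.

```vb
Function ReadBooks(ByVal numBooks As Variant) As Variant
    If IsNull(numBooks) Then
        ReadBooks = Null                    ' unknown in, unknown out
    ElseIf IsNumeric(numBooks) Then
        ReadBooks = CDbl(numBooks) * 2      ' placeholder calculation
    Else
        Err.Raise 13                        ' Type mismatch, passed back to the caller
    End If
End Function
```

With this shape, ReadBooks("Ugh") fails at the entry point with a clear error rather than somewhere deep inside the calculation.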
You might think about using Debug.Assert in this instance, but it is no help at run time because all the calls to the Assert method are stripped out in compilation. So you would still need to implement your own checks anyway.
Of course, verifying that your input parameter is appropriate and satisfies the preconditions is not just about checking the type. It would also involve range checks, ensuring that we are not dividing by 0, and so on.
Is this feasible? In practice, coding defensively like this can become a major chore, and it is easy to slip up or not bother with it. It would be prudent if you were writing an important piece of component code, especially if the interface is public, to place defensive checks at your component entry points. But it is equally likely that a lot of the time you will not get around to this.
What are the consequences of not performing the defensive checks? While this naturally depends on what you are doing in the function, it is most likely that if there is an error it will be a type mismatch error. If the string Ugh in the previous example was used by an operator or built-in function that only worked with numerics, a type mismatch would occur. Interestingly, had the parameter to ReadBooks been declared as a Double instead of a Variant, this same error would be raised if the string Ugh was passed.
The only difference is that in the case of the Variant the error is raised within the function, not outside it. You have the choice of passing this error back to the calling client code or just swallowing the error and carrying on. The approach you take will depend on the particular circumstances and your preferences.
Don't get sidetracked by irrelevant machine-specific details. Almost all the time, we want to deal with numbers. For example, consider your thought process when you choose between declaring a variable to be of type Integer or type Long. You might consider what the likely values of the variable are going to be, worry a little bit about the effect on performance or memory usage, and maybe check to see how you declared a similar variable elsewhere so that you can be consistent. Save time-get into the habit of declaring all these variables as Variants.
NOTE
All variables in my code are either Variants or references to classes. Consequently, a lot of code starts to look like this:
    Dim Top As Variant
After a time I started to take advantage of the fact that Variants are the default, so my code typically now looks like this:
    Dim Top, Left, Width, Height
I see no problem with this, but your current Visual Basic coding standards will more than likely prohibit it. You might think about changing them.
VARIANT BUGS WHEN PASSING PARAMETERS BY REFERENCE
Variants do not always work well when passed by reference, and can give rise to some hard-to-spot bugs. The problem is illustrated in the following example:
    Private Sub Form_Load()
The problem is that you might reasonably expect that after assigning 6.4 to x in the procedure subByRef, which is declared in the parameter list as a Variant, Debug.Print would show 6.4. But instead it shows only 6.
Notice that the only difference between the procedures subByVal and subByRef is that the parameter is passed ByVal in subByVal and ByRef in subByRef. When subByVal is called, the actual parameter i is of type Integer. In subByVal, a new parameter x is created as a Variant of subtype Integer, and is initialized with the value 3. In other words, the subtype of the Variant within the procedure is defined by the type of the variable that the procedure was actually called with. When x is then set to a value of 6.4, it converts to a Variant of subtype Double with value 6.4. Straightforward.
When subByRef is called, Visual Basic has a bit more of a problem. The Integer is passed by reference, so Visual Basic cannot allow noninteger values to be placed in it. Instead of converting the Integer to a Variant, Visual Basic leaves it as an Integer. Thus, even in the procedure subByRef itself, where x is declared as a Variant, x is really an Integer. The assignment of x = 6.4 will result in an implicit CInt call and x ends up with the value 6. Not so straightforward.
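The two procedures can be sketched as follows, reconstructed from the description above. The exact layout is an assumption, but the names, the initial value 3, and the assignment of 6.4 all come from the text.

```vb
Private Sub Form_Load()
    Dim i As Integer
    i = 3
    subByVal i
    subByRef i
End Sub

Private Sub subByVal(ByVal x As Variant)
    x = 6.4
    Debug.Print x           ' 6.4: x is a separate Variant, now of subtype Double
End Sub

Private Sub subByRef(x As Variant)
    x = 6.4
    Debug.Print x           ' 6: x still refers to the caller's Integer
End Sub
```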
Procedures like subByVal are powerful because they can perform the same task, whatever the data type of the actual parameters. They can even perform different tasks depending on the type of the actual parameter, though this can get confusing.
Procedures like subByRef lead to bugs. Avoid them by declaring Variant parameters ByVal rather than passing by reference.
Earlier in the chapter, I extolled the use of Variants in the place of simple data types like Integer and String. Does the same argument apply for objects?
Put simply, the answer is no, because there is considerable extra value added by declaring a variable to be of a specific object type. Unlike the simple data types, we can get useful compile-time error messages that help prevent bugs. If the Variant (or Object) data type was used, these errors would surface only at run time-a bad thing.
By way of explanation, consider the following simple example. In this project there is one class, called Cow, which has a few properties, such as Age, TailLength, and so forth. We then create a routine
    Private Sub AgeMessage(c As Cow)
If you accidentally misspell Age and instead type
    MsgBox c.Agg
provided c is declared as Cow, you will receive a compile-time error message so that you can correct it. If the parameter was declared as a Variant (or Object), Visual Basic cannot know whether there is a legitimate property of c called Agg until, at run time, it actually knows what the object is. Hence, all you get is run-time error 438 instead.
Notice how this argument does not apply back to simple data types. Although simple data types do not have properties, they do have certain operators that may or may not be well defined for them. However, a piece of code such as this
    Dim s As String
where the * operator is undefined for strings, will result in a run-time type mismatch, not a compile-time error. So the advantage of not declaring as Variant is lost.
Flexibility is the fundamental reason to use Variants. But the built-in flexibility of Variants is not advertised enough, and consequently they tend to be underused. Empty, Null, and Variant arrays (and now, in version 6, UDTs) remain underused in the Visual Basic programmer community.
Any uninitialized Variant has the Empty value until something is assigned to it. This is true for all variables of type Variant, whether Public, Private, Static, or local. This is the first feature to distinguish Variants from other data types: you cannot determine whether a variable of any other data type is uninitialized.
As well as testing for a VarType of zero, you can use the shorthand function IsEmpty, which does the same thing but is more readable.
In early versions of Visual Basic, once a Variant was given a value, the only way to reset it to Empty was to assign it to another Variant that itself was empty. In Visual Basic 5 and 6, you can also set it to the keyword Empty, as follows:
    v1 = Empty
I like Empty, although I find it is one of those things that you forget about and sometimes miss opportunities to use. Coming from a C background, where there is no equivalent, isn't much help either. But it does have uses in odd places, so it's worth keeping it in the back of your mind. File under miscellaneous.
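The life cycle of Empty can be sketched in a few lines. The routine name is an assumption for illustration.

```vb
Sub DemoEmpty()
    Dim v1 As Variant

    Debug.Print IsEmpty(v1)             ' True: never assigned
    v1 = 0
    Debug.Print IsEmpty(v1)             ' False: zero is a real value
    v1 = Empty
    Debug.Print VarType(v1) = vbEmpty   ' True: reset to Empty
End Sub
```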
Of course, Null is familiar to everyone as that database "no value" value, found in all SQL databases. But as a Variant subtype it can be used to mean no value or invalid value in a more general sense-in fact, in any sense that you want to use it. Conceptually, it differs from Empty in that it implies you have intentionally set a Variant to this value for some reason, whereas Empty implies you just haven't gotten around to doing anything with the Variant yet.
As with Empty, you have an IsNull function and a Null keyword that can be used directly.
Visual Basic programmers tend to convert a variable with a Null value (read, say, from a database) to something else as quickly as possible. I've seen plenty of code where Null is converted to empty strings or zeros as soon as it's pulled out of a recordset, even though this usually results in information loss and some bad assumptions. I think this stems from the fact that the tasks we want to perform with data items, such as displaying them in text boxes or doing calculations with them, often result in the all too familiar error 94, "Invalid use of Null."
This is exacerbated by the fact that Null propagates through expressions. Any arithmetic operator (+, -, *, /, \, Mod, ^) or comparison operator (<, >, =, <>) that has a Null as one of its operands will result in a Null being the value of the overall expression, irrespective of the type or value of the other operand. This can lead to some well-known bugs, such as:
v = Null
In this code, the message "Hi" will not be displayed. Because v is Null, and = is just a comparison operator here, the value of the expression v = Null is itself Null. And Null is treated as False in If...Then clauses.
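The pitfall, and the IsNull test that avoids it, can be sketched as follows. This is an illustration rather than the original listing; the helper names HoldsNull, EqualsNullIsNull, and DemoNullTest are assumptions introduced here.

```vb
Option Explicit

' Hypothetical helper: the reliable way to test a Variant for Null.
Public Function HoldsNull(ByVal v As Variant) As Boolean
    HoldsNull = IsNull(v)
End Function

' Hypothetical helper: shows that "v = Null" evaluates to Null
' whatever v contains, so it can never be True.
Public Function EqualsNullIsNull(ByVal v As Variant) As Boolean
    EqualsNullIsNull = IsNull(v = Null)
End Function

Public Sub DemoNullTest()
    Dim v As Variant
    v = Null

    If v = Null Then
        Debug.Print "never reached"   ' the condition is Null, treated as False
    End If

    If IsNull(v) Then
        Debug.Print "v is Null"       ' this branch does run
    End If
End Sub
```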
The propagation rule has some exceptions. The string concatenation operator & treats Null as an empty string "" if one of its operands is a Null. This explains, for example, the following shorthand way of removing Null when reading values from a database:
v = "" & v
This will leave v unchanged if it is a string, unless it is Null, in which case it will convert it to "".
Another set of exceptions is with the logical operators (And, Eqv, Imp, Not, Or, Xor). Here Null is treated as a third truth value, as in standard many-valued logic. Semantically, Null should be interpreted as unsure in this context, and this helps to explain the truth tables. For example:
v = True And Null
gives v the value Null, but
v = True Or Null
gives v the value True. This is because if you know A is true, but are unsure about B, then you are unsure about A and B together, but you are sure about A or B. Follow?
By the way, watch out for the Not operator. Because the truth value of Null lies halfway between True and False, Not Null must evaluate to Null in order to keep the logical model consistent. This is indeed what it does.
v = Not Null
That's about all on Null-I think it is the trickiest of the Variant subtypes, but once you get to grips with how it behaves, it can add a lot of value.
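To make the exceptions to the propagation rule concrete, here is a small sketch that prints the result of several expressions involving Null. The Truth function is a hypothetical helper, introduced only to give each of the three truth values a printable name.

```vb
Option Explicit

' Hypothetical helper that names the three truth values.
Public Function Truth(ByVal v As Variant) As String
    If IsNull(v) Then
        Truth = "Null"
    ElseIf v Then
        Truth = "True"
    Else
        Truth = "False"
    End If
End Function

Public Sub DemoNullLogic()
    Debug.Print Truth(True And Null)      ' Null
    Debug.Print Truth(False And Null)     ' False
    Debug.Print Truth(True Or Null)       ' True
    Debug.Print Truth(False Or Null)      ' Null
    Debug.Print Truth(Not Null)           ' Null
    Debug.Print Truth(("" & Null) = "")   ' True: & treats Null as ""
End Sub
```

False And Null comes out as False for the same reason that True Or Null comes out as True: the known operand already settles the answer, whatever the unsure one turns out to be.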
Arrays are now implemented using the OLE data type named SAFEARRAY. This is a data type that, like Variants and classes, allows arrays to be self-describing. The LBound and number of elements for each dimension of the array are stored in this structure. Within the inner workings of OLE, all access to these arrays is through an extensive set of API calls implemented in the system library file OLEAUT32.DLL. You do not get or set the array elements directly; you use API calls. These API calls use the LBound and number of elements to make sure they always write within the allocated area. This is why they are safe arrays-attempts to write to elements outside the allowed area are trapped within the API and gracefully dealt with.
The ability to store arrays in Variants was new to Visual Basic 4, and a number of new language elements were introduced to support them such as Array and IsArray.
To set up a Variant to be an array, you can either assign it to an already existing array or use the Array function. The first of these methods creates a Variant whose subtype is the array value (8192) added to the value of the type of the original array. The Array function, on the other hand, always creates an array of Variants-VarType 8204 (which is 8192 plus 12).
The following code shows three ways of creating a Variant array of the numbers 0, 1, 2, 3:
Dim v As Variant
Notice that the only difference between the last two arrays is that one is a Variant holding an array of integers and the other is a Variant holding an array of Variants. It is easy to get confused here. Look at the following:
ReDim a(5) As Variant
This code creates an array of Variants, but this is not a Variant array. What consequence does this have? Not much anymore. Before version 6 you could utilize array copying only with Variant arrays, but now you can do this with any variable-sized array.
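The three approaches, and the VarType values they produce, can be sketched roughly as follows. This is an illustration rather than the original listing: the function names are hypothetical, and the default Option Base of 0 is assumed.

```vb
Option Explicit

' Way 1: the Array function always yields an array of Variants.
Public Function ViaArrayFunction() As Variant
    ViaArrayFunction = Array(0, 1, 2, 3)
End Function

' Way 2: fill an ordinary Integer array, then assign it to a Variant.
Public Function ViaIntegerArray() As Variant
    Dim a() As Integer
    Dim i As Integer

    ReDim a(3)
    For i = 0 To 3
        a(i) = i
    Next i
    ViaIntegerArray = a
End Function

' Way 3: start from an empty Array() and grow it inside the Variant.
Public Function ViaReDimInVariant() As Variant
    Dim v As Variant
    Dim i As Integer

    v = Array()
    For i = 0 To 3
        ReDim Preserve v(i)
        v(i) = i
    Next i
    ViaReDimInVariant = v
End Function

Public Sub DemoVariantArrays()
    Debug.Print VarType(ViaArrayFunction())    ' 8204 = vbArray + vbVariant
    Debug.Print VarType(ViaIntegerArray())     ' 8194 = vbArray + vbInteger
    Debug.Print VarType(ViaReDimInVariant())   ' 8204 = vbArray + vbVariant
End Sub
```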
So what is useful about placing an array in a Variant? As Variants can contain arrays, and they can be arrays of Variants, those contained Variants can themselves be arrays, maybe of Variants, which can also be arrays, and so on and so forth.
Just how deep can these arrays be nested? I don't know if there is a theoretical limit, but in practice I have tested at least 10 levels of nesting. This odd bit of code works fine:
Dim v As Variant, i As Integer
How do these compare to more standard multidimensional arrays? Well, on the positive side, they are much more flexible. The contained arrays-corresponding to the lower dimensions of a multidimensional array-do not have to have the same number of elements. Figure 4-2 explains the difference pictorially.
Figure 4-2 The difference between a standard two-dimensional array (top) and a Variant array (bottom)
These are sometimes known as ragged arrays. As you can see from the diagram, we do not have all the wasted space of a multidimensional array. However you have to contrast that with the fact that the Variant "trees" are harder to set up.
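A ragged array of this kind might be built and read as in the sketch below. The data and the names BuildRagged and DemoRagged are made up for illustration, and the default Option Base of 0 is assumed.

```vb
Option Explicit

' Hypothetical example: three "rows" of different lengths held in one Variant.
Public Function BuildRagged() As Variant
    BuildRagged = Array( _
        Array(1, 2, 3, 4), _
        Array(5), _
        Array(6, 7))
End Function

Public Sub DemoRagged()
    Dim v As Variant
    Dim row As Variant

    v = BuildRagged()
    For Each row In v
        ' Number of elements in each contained array
        Debug.Print UBound(row) - LBound(row) + 1
    Next row

    Debug.Print v(2)(1)     ' second element of the third row
End Sub
```

Each element is reached with one pair of parentheses per level of nesting, as in v(2)(1), rather than with the comma-separated subscripts of a true multidimensional array.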
This ability of Variants to hold arrays of Variants permits some interesting new data structures in Visual Basic. One obvious example is a tree. In this piece of code, an entire directory structure is folded up and inserted in a single Variant:
Private Sub Form_Load()
Using + for String Concatenation
This misconceived experiment with operator overloading was considered bad form even back in the days of Visual Basic 2, when the string concatenation operator & was first introduced. Yet it's still supported in Visual Basic 6. In particular, since version 4 brought in extensive implicit type conversion between numerics and strings, this issue has become even more important. It's easy to find examples of how you can get tripped up. Can you honestly be confident of what the following will print?
Debug.Print "56" + 48
What should happen is that adding two strings has the same effect as subtracting, multiplying, or dividing two strings-that is, the addition operator should treat the strings as numeric if it can; otherwise, it should generate a type mismatch error. Unfortunately, this is not the case. The only argument for why the operator stays in there, causing bugs, is backward compatibility.
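For comparison, here is a short sketch of how + behaves with different operand types, set against - and &. The procedure name DemoPlusOperator is hypothetical.

```vb
Option Explicit

Public Sub DemoPlusOperator()
    Debug.Print "56" + 48      ' string + number: numeric addition, 104
    Debug.Print "56" + "48"    ' string + string: concatenation, 5648
    Debug.Print "56" - "48"    ' string - string: numeric subtraction, 8
    Debug.Print "56" & 48      ' & always concatenates, 5648
End Sub
```

Only + changes its meaning with the types of its operands, which is the argument for always using & when concatenation is what you intend.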
One point to note about the directory tree code is that this is an extremely efficient way of storing a tree structure. Because v is a multidimensional ragged array, the structure contains less wasted space than its equivalent fixed-size multidimensional array. This contrasts with the accusation usually leveled at Variants, that they waste a lot of memory space.
The rehabilitation of UDTs was the biggest surprise for me in version 6 of Visual Basic. It had looked as if UDTs were being gradually squeezed out of the language. In particular, the new language features such as classes, properties, and methods did not seem to include UDTs. Before version 6, it was not possible to
pass a UDT as a parameter ByVal to a sub or function.
have a UDT as a parameter to a public method of a class or form.
have a UDT as the return type of a public method of a class or form.
place a UDT into a Variant.
But this has suddenly changed and now it is possible in version 6 to perform most of these to a greater or lesser extent. In this chapter, I am really only concentrating on the last point, that of placing a UDT into a Variant.
Restrictions are imposed on the sorts of UDTs that can be placed in a Variant. They must be declared within a public object module. This rules out their use within Standard EXE programs, as these do not have public object modules. This is a Microsoft ActiveX-only feature. Internally, the Data portion of the Variant structure is always a simple pointer to an area of memory where the UDT's content is sitting. The Type is always 36. This prompts the question of where and how the meta-data describing the fields of the UDT is kept. Remember that all other Variant subtypes are self-describing, so UDTs must be, too. The way it works is that from the Variant you can also obtain an IRecordInfo interface pointer. That interface has functions that return everything you want to know about the UDT.
We are able to improve substantially on the nesting ability demonstrated earlier with Variant arrays. While it is still impossible to have a member field of a UDT be that UDT itself-a hierarchy that is commonly needed-you can use a Variant and sidestep the circular reference trap. The following code shows a simple example of an employee structure (Emp) in an imaginary, not-so-progressive organization (apologies for the lack of originality). The boss and an array of workers are declared as Variant-these will all in fact be Emps themselves. GetEmp is just a function that generates Emps.
' In Class1
Note that this code uses the ability to return a UDT from a function. Also, the Array function always creates an array of Variants, so this code now works because we can convert the return value of GetEmp to a Variant.
Interface Inviolability
If you're like me, you may well have experienced the frustration of creating ActiveX components (in-process or out-of-process, it doesn't matter) and then realizing you need to make a tiny upgrade.
You don't want to change the interface definition because then your server is no longer compatible, the CLSID has changed, and you get into all the troublesome versioning complexity. Programs and components that use your component will all have problems or be unable to automatically use your upgraded version.
There isn't a lot you can do about this. Visual Basic imposes what is a very good discipline on us with its version compatibility checking, though it is sometimes a bitter pill to swallow.
In this respect, the flexibility gained by using Variants for properties and methods' parameters can be a great headache saver.
One drawback to this is that Visual Basic does not know at compile time the actual type of Workers, so you might write errors that will not be found until run time, such as the following:
a.Workers.qwert = 74
Accessing an invalid property like this will not be caught at compile time. This is analogous to the behavior of using Variants to hold objects described earlier. Similarly, the VarType of a.Workers is 8204-vbArray + vbVariant. Visual Basic does not know what is in this array. If we rewrote the above code like this:
' In Class1
This time the VarType of a.Workers is 8228-vbArray + vbUserDefinedType. In other words, Visual Basic knows that Workers is an array of Emps, not an array of Variants. This has similarities to the late-bound and early-bound issue with objects and classes. (See "How Binding Affects ActiveX Component Performance" in the Visual Basic Component Tools Guide.) At compile time, however, the checking of valid methods and properties is still not possible because the underlying declaration is Variant.
The alternative way of implementing this code would be to create a class called Emp that had other Emps within it-I'm sure you've often done something similar to this. What I find interesting about the examples above is the similarity they have with this sort of class/object code-but no objects are being created here. We should find performance much improved over a class-based approach because object creation and deletion still take a relatively long time in Visual Basic. This approach differs slightly in that an assignment from one Variant containing a UDT to another Variant results in a deep copy of the UDT. So in the above examples, if you copy an Emp, you get a copy of all the fields and their contents. With objects, you are just copying the reference and there is still only one underlying object in existence. Using classes rather than UDTs for this sort of situation is still preferable given the many other advantages of classes, unless you are creating hundreds or thousands of a particular object. In this case, you might find the performance improvement of UDTs compelling.
MORE ON PASSING PARAMETERS BY REFERENCE
You might be wondering, "Why should I avoid passing parameters by reference? It's often very useful." In many situations, passing parameters by reference is indicative of bad design. Just as using global variables is bad design but can be the easy or lazy way out, passing parameters by reference is a shortcut that often backfires at a later date.
Passing parameters by reference is a sign that you don't have the relationships between your functions correct. The mathematical model of a function is of the form:
x = f(a, b, c, ...)
where the function acts on a, b, c, and so on to produce result x. Both sides of the equal sign are the same value. You can use either x or f(a, b, c, ...) interchangeably.
Likewise in Visual Basic, functions can be used as components in expressions, as this example shows:
x = Sqr(Log(y))
This is not quite the case in a Visual Basic program, because the function does something in addition to returning a value. But it's still most useful to think of the return value x as the result of what that function f does. But if x contains the result, the result cannot also be in a, b, or c. In other words, only x is changed by the function. This is my simplistic conceptual model of a function, and it is at odds with the notion of passing by reference. Passing a parameter by reference often indicates one of the following:
This is going to lead to larger functions than need be, functions that are more complex than need be, and functions that are not as useful as they could be. You should break down the functions so that each one does only one task, as in the mathematical model.
If the function needs to return two related values, say an X and a Y value for a coordinate, create a class or UDT to hold the object that these values relate to, and return that. If the values are not sufficiently related to be able to define a class, you are almost certainly doing too much in the one function. As an example, this
GetCenter(f As Form, ByRef X, ByRef Y)
would be better written as
Set p = GetCenter(f As Form)
where p is an object of class Point. Alternatively,
p = GetCenter(f As Form)
Here p is a Variant UDT.
By meta-data I mean a description of the data returned. Functions won't need to return meta-data if you use only self-describing data types. For example, functions that return an array or a single element, depending upon some argument, should return a Variant, which can hold either, and the caller can use IsArray to determine what sort of data is returned.
It is quite common to use the return value of a function to return True, False, or perhaps an error code. The actual data value is returned as a parameter by reference. For example, consider this code fragment:
bRet = GetFileVersion(ByRef nVersion, filename)
Here the version of filename is returned by reference, provided the file was found correctly. If the file was not found, the function will return False and nVersion will not be accurate. You have a couple of alternatives.
Raise errors. This has always been the Visual Basic way. (For example, Visual Basic's own Open works in this way.)
Return a Variant and use the CVErr function to create a Variant of subtype vbError holding the error condition. The caller can then use IsError as follows:
nVersion = GetFileVersion(filename)
If IsError(nVersion) Then
    ' nVersion is unreliable, take some action...
Else
    ' nVersion is reliable, use it...
End If
Error raising and trapping is not to everyone's taste, so the second option might appeal to you.
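The second option might look something like the sketch below. The body of GetFileVersion is entirely hypothetical: it fakes the lookup with a hard-coded file name and version so that the example stays self-contained, and it uses error number 53 ("File not found") as the error value.

```vb
Option Explicit

' Hypothetical stand-in for a real version lookup. Returns the version
' number on success, or a Variant of subtype vbError on failure.
Public Function GetFileVersion(ByVal filename As String) As Variant
    Select Case LCase$(filename)
        Case "app.exe"
            GetFileVersion = 3              ' made-up version number
        Case Else
            GetFileVersion = CVErr(53)      ' 53 = "File not found"
    End Select
End Function

Public Sub DemoGetFileVersion()
    Dim nVersion As Variant

    nVersion = GetFileVersion("missing.dat")
    If IsError(nVersion) Then
        Debug.Print "Version unavailable"
    Else
        Debug.Print "Version is "; nVersion
    End If
End Sub
```

Because the error travels back in the return value itself, the function needs no ByRef parameter, and a caller cannot mistake the failure case for a valid version number.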
Bugs result when instances of parameters unexpectedly change their value. To put it at its simplest, parameters changing their value are a side effect, and side effects trip up programmers. If I call the function
Dim a, b
it is immediately apparent to me that a is likely to change, and probably b will not. But I cannot guarantee that without looking at the source code for the function f.
It is particularly unfortunate that the default parameter-passing convention is ByRef, because it means that you have to do extra typing to avoid ByRef parameters. Because of backward compatibility, I don't think there's any chance of this being changed. However, it has been suggested by the Visual Basic team that they could fill in the ByRefs automatically on listback in the editor so that people can see exactly what's happening. The problem is that this would alter all old code as it is brought into the editor and cause problems with source code control.