Top string Questions

List of Tags
1019
gramm

How can I check if one string contains another substring in JavaScript?

Usually I would expect a String.contains() method, but there doesn't seem to be one.


Update: It seems that I have another problem.

When I use the ".indexof" method, Firefox refuses to start the JavaScript code (this is for an extension).

My code is:

var allLinks = content.document.getElementsByTagName("a");

for (var i=0, il=allLinks.length; i<il; i++) {
    elm = allLinks[i];
    var test = elm.getAttribute("class");
    if (test.indexof("title") !=-1) {
        alert(elm);
        foundLinks++;
    }
}

if (foundLinks === 0) {
    alert("No title class found");
} 
else {
    alert("Found " + foundLinks + " title class");
}

Firefox doesn't display an alert box. This works if I get rid of the .indexof() method. I already tried something like if (test=="title")..., but it didn't work.

Answered By: Fabien M&#233;nager ( 1602)

indexOf returns the position of the string in the other string. If not found, it will return -1:

var s = "foo";
alert(s.indexOf("oo") !== -1);
732
Lance Fisher

In C#, what is the difference between String and string? (note the case)

Example:

string s = "Hello, World";

String S = "Hello, World";

Also, what are the guidelines for the use of each?

Answered By: Derek Park ( 719)

string is an alias for System.String. So technically, there is no difference. It's like int vs. System.Int32.

As far as guidelines, I think it's generally recommended to use string any time you're referring to an object. e.g.

string place = "world";

Likewise, I think it's generally recommended to use String if you need to refer specifically to the class. e.g.

string greet = String.Format("Hello {0}!", place);

This is the style that Microsoft tends to use in their examples.


It appears that the guidance in this area may have changed, as StyleCop now enforces the use of the C#-specific aliases.

605
Johnny Maelstrom

If you have java.io.InputStream object, how should you process that object and produce a String?


Suppose I have an InputStream that contains text data, and I want to convert this to a String (for example, so I can write the contents of the stream to a log file).

What is the easiest way to take the InputStream and convert it to a String?

public String convertStreamToString(InputStream is) { 
    // ???
}
Answered By: Pavel Repin ( 614)

Here's a way using only standard Java library (note that the stream is not closed, YMMV).

public static String convertStreamToString(java.io.InputStream is) {
    java.util.Scanner s = new java.util.Scanner(is).useDelimiter("\\A");
    return s.hasNext() ? s.next() : "";
}

I learned this trick from "Stupid Scanner tricks" article. The reason it works is because Scanner iterates over tokens in the stream, and in this case we separate tokens using "beginning of the input boundary" (\A) thus giving us only one token for the entire contents of the stream.

Note, if you need to be specific about the input stream's encoding, you can provide the second argument to Scanner ctor that indicates what charset to use (e.g. "UTF-8").

Hat tip goes also to Jacob, who once pointed me to the said article.

EDITED: Thanks to a suggestion from Patrick, made the function more robust when handling an empty input stream. One more edit: nixed try/catch, Patrick's way is more laconic.

602
Ahamed

In Swing, the password field has a getPassword() (returns char[]) method instead of the usual getText() (returns String) method. Similarly, I have come across a suggestion not to use Strings to handle passwords. Why does String pose a threat to security when it comes to passwords?

It feels inconvenient to use char[].

Answered By: Jon Skeet ( 836)

Strings are immutable. That means once you've created the string, if another process can dump memory, there's no way (aside from reflection) you can get rid of the data before GC kicks in.

With an array, you can explicitly wipe the data after you're done with it: you can overwrite the array with anything you like, and the password won't be present anywhere in the system, even before garbage collection.

So yes, this is a security concern - but even using char[] only reduces the window of opportunity for an attacker, and it's only for this specific type of attack.

EDIT: As noted in comments, it's possible that arrays being moved by the garbage collector will leave stray copies of the data in memory. I believe this is implementation-specific - the GC may clear all memory as it goes, to avoid this sort of thing. Even if it does, there's still the time during which the char[] contains the actual characters as an attack window.

568
abcXYZ

I came across this piece of code, and found it rather interesting. The following print statement would print "hello world". Could anyone explain this?

System.out.println(randomString(-229985452) + " " + randomString(-147909649));

And randomString() looks like this

public static String randomString(int i)
{
    Random ran = new Random(i);
    StringBuilder sb = new StringBuilder();
    for (int n = 0; ; n++)
    {
        int k = ran.nextInt(27);
        if (k == 0)
            break;

        sb.append((char)('`' + k));
    }

    return sb.toString();
}
Answered By: Eng.Fouad ( 570)

The other answers explain why, but here is how:

new Random(-229985452).nextInt(27)

The first 6 numbers that the above random generates are:

8
5
12
12
15
0

and the first 6 numbers that new Random(-147909649).nextInt(27) generates are:

23
15
18
12
4
0

Then just add those numbers to the integer representation of the character ` (which is 96):

8  + 96 = 104 --> h
5  + 96 = 101 --> e
12 + 96 = 108 --> l
12 + 96 = 108 --> l
15 + 96 = 111 --> o

23 + 96 = 119 --> w
15 + 96 = 111 --> o
18 + 96 = 114 --> r
12 + 96 = 108 --> l
4  + 96 = 100 --> d
425
Robert Wills

I want to capitalize the first character of a string, but not change the case of any of the other letters. For example:

  • this is a test -> This is a test
  • the Eiffel Tower -> The Eiffel Tower
  • /index.html -> /index.html
Answered By: Steve Harrison ( 651)

Another solution:

function capitaliseFirstLetter(string)
{
    return string.charAt(0).toUpperCase() + string.slice(1);
}
411
Derek

How can I convert a JavaScript string value to be in all lower case letters?

Example: "Your Name" to "your name"

Answered By: John Topley ( 609)
"Your Name".toLowerCase();
361
Agnel Kurian

How do I convert a string to a byte array in .NET (C#)?

Update: Also please explain why encoding should be taken into consideration. Can't I simply get what bytes the string has been stored in? Why this dependency on encoding?!!!

Answered By: Mehrdad ( 243)

Contrary to the answers here, you DON'T need to worry about encoding!

Like you mentioned, your goal is, simply, to "get what bytes the string has been stored in".
(And, of course, to be able to re-construct the string from the bytes.)

For those goals, I honestly do not understand why people keep telling you that you need the encodings. You certainly do NOT need to worry about encodings for this.

Just do this instead:

static byte[] GetBytes(string str)
{
    byte[] bytes = new byte[str.Length * sizeof(char)];
    System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
    return bytes;
}

static string GetString(byte[] bytes)
{
    char[] chars = new char[bytes.Length / sizeof(char)];
    System.Buffer.BlockCopy(bytes, 0, chars, 0, bytes.Length);
    return new string(chars);
}

As long as your program (or other programs) don't try to interpret the bytes somehow, which you obviously didn't mention you intend to do, then there is nothing wrong with this approach! Worrying about encodings just makes your life more complicated for no real reason.

Additional benefit to this approach:

It doesn't matter if the string contains invalid characters, because you can still get the data and reconstruct the original string anyway!

It will be encoded and decoded just the same, because you are just looking at the bytes.

If you used a specific encoding, though, it would've given you trouble with encoding/decoding invalid characters.

I wanted to add the elements of an array into another, so I tried this simple sentence in our beloved Firebug:

[1,2] + [3,4]

It responded with:

"1,23,4"

What is going on?

Answered By: Saul ( 403)

The + operator is not defined for arrays.

What happens is that Javascript converts arrays into strings and concatenates those.

 

Update

Since this question and consequently my answer is getting a lot of attention I felt that in addition to the insightful stuff posted by Jeremy Banks it would be useful to have an overview about how the + operator behaves in general.

So, here it goes.

Excluding E4X and implementation-specific stuff, JavaScript has 6 built-in data types:

  1. undefined
  2. boolean
  3. number
  4. string
  5. function
  6. object

Note that neither null nor [] is a separate type - both return object when fed to typeof. However + works differently in either case.

That's right - JavaScript has no primitive arrays as such; only instances of a class called Array with some syntactic sugar to ease the pain.

Adding more to the confusion, wrapper entities such as new Number(5), new Boolean(true) and new String("abc") are all of object type, not numbers, booleans or strings as one might expect. Nevertheless for arithmetic operators Number and Boolean behave as numbers.

Easy, huh? With all that out of the way, we can move on to the overview itself.

Different result types of + by operand types

-------------------------------------------------------------------------------------------
            | undefined | boolean | number | string | function | object | null   | array  | 
-------------------------------------------------------------------------------------------

undefined   | number    | number  | number | string | string   | string | number | string | 

boolean     | number    | number  | number | string | string   | string | number | string | 

number      | number    | number  | number | string | string   | string | number | string | 

string      | string    | string  | string | string | string   | string | string | string | 

function    | string    | string  | string | string | string   | string | string | string | 

object      | string    | string  | string | string | string   | string | string | string | 

null        | number    | number  | number | string | string   | string | number | string | 

array       | string    | string  | string | string | string   | string | string | string | 

-------------------------------------------------------------------------------------------

* this applies to Chrome 13, Firefox 6, Opera 11 and IE9. Checking other browsers and versions is left as an exercise for the reader.

Note: As pointed out by CMS, for certain cases of objects such as Number, Boolean and custom ones the + operator doesn't necessarily produce a string result. It can vary depending on the implementation of object to primitive conversion. For example var o = { valueOf:function () { return 4; } }; evaluating o + 2; produces 6, a number, evaluating o + '2' produces '42', a string.

To see how the overview table was generated visit http://jsfiddle.net/4EjXd/

Given that strings are immutable in .NET, I'm wondering why they have been designed such that string.Substring() takes O(substring.Length) time, instead of O(1)?

i.e. what were the tradeoffs, if any?

Answered By: Eric Lippert ( 270)

UPDATE: I liked this question so much, I just blogged it. See http://blogs.msdn.com/b/ericlippert/archive/2011/07/19/strings-immutability-and-persistence.aspx


The short answer is: O(n) is O(1) if n does not grow large. Most people extract tiny substrings from tiny strings, so how the complexity grows asymptotically is completely irrelevant.

The long answer is:

An immutable data structure built such that operations on an instance permit re-use of the memory of the original with only a small amount (typically O(1) or O(lg n)) of copying or new allocation is called a "persistent" immutable data structure. Strings in .NET are immutable; your question is essentially "why are they not persistent"?

Because when you look at operations that are typically done on strings in .NET programs, it is in every relevant way hardly worse at all to simply make an entirely new string. The expense and difficulty of building a complex persistent data structure doesn't pay for itself.

People typically use "substring" to extract a short string -- say, ten or twenty characters -- out of a somewhat longer string -- maybe a couple hundred characters. You have a line of text in a comma-separated file and you want to extract the third field, which is a last name. The line will be maybe a couple hundred characters long, the name will be a couple dozen. String allocation and memory copying of fifty bytes is astonishingly fast on modern hardware. That making a new data structure that consists of a pointer to the middle of an existing string plus a length is also astonishingly fast is irrelevant; "fast enough" is by definition fast enough.

The substrings extracted are typically small in size and short in lifetime; the garbage collector is going to reclaim them soon, and they didn't take up much room on the heap in the first place. So using a persistent strategy that encourages reuse of most of the memory is also not a win; all you've done is made your garbage collector get slower because now it has to worry about handling interior pointers.

If the substring operations people typically did on strings were completely different, then it would make sense to go with a persistent approach. If people typically had million-character strings, and were extracting thousands of overlapping substrings with sizes in the hundred-thousand-character range, and those substrings lived a long time on the heap, then it would make perfect sense to go with a persistent substring approach; it would be wasteful and foolish not to. But most line-of-business programmers do not do anything even vaguely like those sorts of things. .NET is not a platform that is tailored for the needs of the Human Genome Project; DNA analysis programmers have to solve problems with those string usage characteristics every day; odds are good that you do not. The few who do build their own persistent data structures that closely match their usage scenarios.

For example, my team writes programs that do on-the-fly analysis of C# and VB code as you type it. Some of those code files are enormous and thus we cannot be doing O(n) string manipulation to extract substrings or insert or delete characters. We have built a bunch of persistent immutable data structures for representing edits to a text buffer that permit us to quickly and efficiently re-use the bulk of the existing string data and the existing lexical and syntactic analyses upon a typical edit. This was a hard problem to solve and its solution was narrowly tailored to the specific domain of C# and VB code editing. It would be unrealistic to expect the built-in string type to solve this problem for us.